Abstract

Joint user selection (US) and vector precoding (US-VP) is proposed for the multiuser multiple-input multiple-output (MU-MIMO) downlink. The main difference between joint US-VP and conventional US is that US depends on the data symbols for joint US-VP, whereas conventional US is independent of the data symbols. The replica method is used to analyze the performance of joint US-VP in the large-system limit, where the numbers of transmit antennas, users, and selected users tend to infinity while their ratios are kept constant. The analysis under the assumptions of replica symmetry (RS) and 1-step replica symmetry breaking (1RSB) implies that optimal data-independent US provides nothing but the same performance as random US in the large-system limit, whereas data-independent US is capacity-achieving as only the number of users tends to infinity. It is shown that joint US-VP can provide a substantial reduction of the energy penalty in the large-system limit. Consequently, in terms of the achievable sum rate, joint US-VP outperforms separate US-VP, which consists of a combination of vector precoding (VP) and data-independent US. In particular, data-dependent US can be applied to general modulation and implemented with a greedy algorithm.

Index Terms

Multiuser multiple-input multiple-output (MU-MIMO) downlink, multiple-input multiple-output broadcast channel (MIMO-BC), zero-forcing transmit beamforming, user selection, vector precoding, energy penalty, achievable rate, large-system analysis, order statistics, statistical physics, replica method, replica symmetry breaking (RSB).

I. Introduction
Multiple-input multiple-output (MIMO) systems use multiple transmit and receive antennas to increase the spectral efficiency [1], [2]. In early work, point-to-point MIMO or the multiuser MIMO (MU-MIMO) uplink was investigated [3]-[8]. In these MIMO systems, the receiver can utilize all received signals to detect the transmitted data. Recent research activities have shifted to the MU-MIMO downlink, in which one base station (BS) communicates with non-cooperative users. In the MU-MIMO downlink, the main part of signal processing is at the transmitter side, whereas it is at the receiver side for the MU-MIMO uplink.

Transmit strategies used for the MU-MIMO downlink depend on duplexing. For the MU-MIMO downlink with frequency-division duplexing (FDD), channel state information (CSI) is not available at the transmitter side. Instead, the BS may utilize limited feedback information about channel quality, transmitted through the uplink channels [9], [10]. For the MU-MIMO downlink with time-division duplexing (TDD), on the other hand, CSI is used to pre-cancel inter-user interference (IUI) at the transmitter side. The CSI may be estimated by utilizing the fact that fading coefficients in both links are identical for TDD [11], [12]. In particular, it is possible for the BS to attain accurate CSI when the coherence time is sufficiently long. In this paper, the MU-MIMO downlink with TDD is considered under the assumption that the coherence time is sufficiently long. For simplicity, we assume that perfect CSI is available at the transmitter and that each user has one receive antenna.

The MU-MIMO downlink we consider is mathematically modeled as the MIMO broadcast channel (MIMO-BC) with perfect CSI at the transmitter. Recent excellent papers [13]-[16] have proved that the capacity region of the MIMO-BC with perfect CSI at the transmitter is achieved by dirty-paper coding (DPC) [17], which is a sophisticated scheme that pre-cancels IUI at the transmitter side. Since DPC is infeasible in terms of computational complexity, however, constructing a suboptimal scheme that achieves an acceptable tradeoff between performance and complexity is an active research area and the target of this paper.

Zero-forcing transmit beamforming (ZFBF) [18]-[20] is a simple approach for pre-cancelling IUI at the transmitter side. The ZFBF decomposes the MIMO-BC into per-user interference-free channels. A drawback of the ZFBF is that the energy penalty, which is the energy required for the pre-cancellation of IUI, increases rapidly as the number of (supported) users gets closer to the number of transmit antennas. An increase of the energy penalty results in a degradation of the receive signal-to-noise ratio (SNR).

The number of users is commonly larger than the number of transmit antennas for the MU-MIMO downlink. In order to reduce the energy penalty, user selection (US) has been proposed [21]-[24]: A subset of users is selected to mitigate the increase of the energy penalty. Interestingly, it has been shown that a greedy algorithm of US can achieve the sum capacity of the MIMO-BC when only the number of users tends to infinity [23], [25]. This result can be understood as follows: If the channel vectors for all users were orthogonal to each other, the ZFBF would be optimal. However, there are dependencies between the channel vectors in general. In US, the BS attempts to select a subset of users with almost orthogonal channel vectors. It is possible to pick a finite number of almost orthogonal channel vectors from an infinite number of channel vectors under proper conditions. Thus, the ZFBF with US can achieve the sum capacity of the MIMO-BC when the number of users tends to infinity. Since the number of selected users should be comparable to the number of transmit antennas, the interpretation above implies that the performance of US degrades significantly as the number of transmit antennas gets closer to the number of users.

The situation in which the number of transmit antennas is comparable to the number of users is becoming practical [12]. As an alternative limit representing this situation, we consider the large-system limit in which the number of transmit antennas and the number of users tend to infinity while their ratio is kept constant.

Vector perturbation or vector precoding (VP) is an effective precoding scheme that works well in the large-system limit [26]-[28]. In VP, the data symbols are modified to take values in relaxed alphabets to reduce the energy penalty. As relaxed alphabets, lattice-type alphabets [26] and a continuous alphabet [27] have been proposed. In this paper, VP schemes with lattice-type and continuous alphabets are referred to as "lattice VP (LVP)" and "continuous VP (CVP)," respectively. The search for a modified data symbol vector that minimizes the energy penalty is NP-hard for LVP, so that LVP is infeasible for large alphabets or a large number of users. On the other hand, the search for CVP reduces to a quadratic optimization problem [29], which may be solved by using an efficient algorithm. The large-system analysis in [27], [30] has shown that the performance of CVP is comparable to that of LVP in the large-system limit. In this paper, we only focus on CVP.

A drawback of CVP is that the modulation is restricted to quadrature phase shift keying (QPSK). This restriction results in poor performance, especially in the high SNR regime. In this paper, we propose a novel precoding scheme that is applicable to any modulation. The basic idea is to combine US and VP (US-VP). The joint US-VP we propose should not be confused with separate US-VP [28], in which a subset of users is first selected on the basis of CSI and subsequently VP is performed for the selected users. The crucial difference between the two schemes is that US depends on the data symbols for joint US-VP, whereas it is independent of the data symbols for separate US-VP. In this paper, joint US-VP is simply referred to as US-VP.

Data-dependent US (DD-US), proposed in our previous work [31], can be regarded as a special example of US-VP: It is equivalent to US-VP with the original alphabet as the relaxed alphabet. DD-US allows us to use any modulation, as conventional US does. Furthermore, DD-US can be easily implemented with a suboptimal greedy algorithm [31].

The goal of this paper is to assess the performance of US-VP. For that purpose, we consider the large-system limit in which the number of transmit antennas, the number of users, and the number of selected users tend to infinity while their ratios are kept constant. The replica method is used to analyze the performance of US-VP in the large-system limit. The replica method was originally developed in statistical physics [32]-[34], and has been used to analyze the performance of MIMO systems [27], [30], [35]-[41] since Tanaka's pioneering work [42].

The weakness of the replica method is that it is based on several non-rigorous assumptions, such as the commutativity between the large-system limit and the other limits, replica continuity, and replica symmetry (RS). The commutativity was justified for a spin glass model [43], called the Sherrington-Kirkpatrick (SK) model. The validity of replica continuity is open. The RS assumption may be broken for several models. In this case, the assumption of replica symmetry breaking (RSB) should be considered [44]. The RS assumption corresponds to the situation under which an energy function, called free energy, is unimodal. On the other hand, the RSB assumption corresponds to the situation under which the free energy has many local minima [33]. The simplest (strongest) assumption for RSB is referred to as 1-step RSB (1RSB). The most complex (weakest) assumption for RSB is called full-step RSB (full-RSB), which includes the RS assumption and the other lower-step RSB assumptions. In this paper, only the RS and 1RSB assumptions are considered, since the assumption of higher-step RSB yields numerically unsolvable results for our problem. Thus, the results presented in this paper should be regarded as an approximation of the true ones.

Recently, the validity of several results obtained from the replica method has been investigated. Korada and Montanari [45] proved Tanaka's formula based on the RS assumption. Guerra and Talagrand's excellent works [46], [47] proved that the replica analysis under the full-RSB assumption provides the correct result for the SK model. The latter methodology might be applicable to our problem.

This paper is organized as follows: After summarizing the notation used in this paper, in Section II we introduce the MIMO-BC and US-VP. Section III summarizes the main results of this paper. In Section IV we present numerical results based on the main results. Section V concludes this paper. The derivations of the main results are summarized in the appendices.
A. Notation

For a complex number $z \in \mathbb{C}$, the real and imaginary parts of $z$ are denoted by $\Re[z]$ and $\Im[z]$, respectively. Furthermore, $z^*$ stands for the complex conjugate of $z$. For a matrix $A$, the transpose, conjugate transpose, trace, and determinant of $A$ are denoted by $A^T$, $A^H$, $\mathrm{Tr}A$, and $\det A$, respectively. $I_N$ stands for the $N \times N$ identity matrix. $1_N$ represents the $N$-dimensional vector whose elements are all one. $\mathrm{diag}\{a_1, \ldots, a_N\}$ stands for the $N \times N$ diagonal matrix with $a_n$ as the $n$th diagonal element. The Kronecker product operator between two matrices is denoted by $\otimes$.

For a set $\mathcal{A} = \{a_i : i = 1, \ldots, N\}$, $\backslash a_i$ stands for the set $\{a_{i'} : \text{for all } i' \neq i\}$ obtained by eliminating $a_i$ from $\mathcal{A}$. Similarly, $\backslash\mathcal{A}_i$ denotes the union $\cup_{i' \neq i}\mathcal{A}_{i'}$ for sets $\{\mathcal{A}_i\}_{i=1}^{N}$. The direct product $\mathcal{A}_1 \times \cdots \times \mathcal{A}_N$ is denoted by $\prod_{i=1}^{N}\mathcal{A}_i$.

For a random variable $X$, $\mathbb{E}[X]$ and $\mathbb{V}[X]$ stand for the mean and variance of $X$, respectively. For the sequence of real random variables $\{X_i\}_{i=1}^{N}$, $X_{(i)}$ denotes the $i$th order statistic of $\{X_i\}$, i.e. $X_{(1)} \le \cdots \le X_{(N)}$ [48]. $\mathcal{N}(m, \Sigma)$ represents a real Gaussian distribution with mean $m$ and covariance matrix $\Sigma$. Similarly, $\mathcal{CN}(m, \Sigma)$ stands for a proper complex Gaussian distribution with mean $m$ and covariance matrix $\Sigma$ [49].

For a discrete random variable $X$, the entropy of $X$ is denoted by $H(X)$. If $X$ is a continuous random variable, $h(X)$ represents the differential entropy. For two random variables $X$ and $Y$, the mutual information between $X$ and $Y$ is denoted by $I(X; Y)$. Throughout this paper, all logarithms are taken to base 2, while the natural logarithm is denoted by $\ln$.

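Finally, we summarize several functions used in this paper. The function $\delta(\cdot)$ represents the Dirac delta function. For a proposition $P$, the indicator function $1(P)$ is defined as

$$1(P) = \begin{cases} 1 & \text{if } P \text{ is true} \\ 0 & \text{if } P \text{ is false.} \end{cases} \tag{1}$$

The probability density function (pdf) of a circularly symmetric complex Gaussian random variable with variance $\sigma^2$ is denoted by

$$p_{\mathrm{CG}}(z; \sigma^2) = \frac{1}{\pi\sigma^2} e^{-|z|^2/\sigma^2} \quad \text{for } z \in \mathbb{C}. \tag{2}$$

Similarly, the pdf of a zero-mean real Gaussian random variable with variance $\sigma^2$ is written as

$$p_{\mathrm{G}}(x; \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-x^2/(2\sigma^2)} \quad \text{for } x \in \mathbb{R}. \tag{3}$$

The standard Gaussian measure $Dx$ is defined as

$$Dx = p_{\mathrm{G}}(x; 1)\,dx. \tag{4}$$

Furthermore, the function $Q(x)$ is given by

$$Q(x) = \int_{x}^{\infty} Dy. \tag{5}$$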



II. System Models 

A. MIMO Broadcast Channel 

We consider the MIMO-BC which consists of one BS with $N$ transmit antennas and $K$ users, each with one receive antenna. For simplicity, Rayleigh block-fading channels are assumed: The channel gains between the BS and each user are fixed during $T_c$ time slots, and at the beginning of the next fading block the channel gains are independently sampled from a circularly symmetric complex Gaussian distribution. Let $y_{k,t} \in \mathbb{C}$ denote the received signal for user $k$ in time slot $t$. The receive vector $y_t = (y_{1,t}, \ldots, y_{K,t})^T \in \mathbb{C}^K$ consisting of all received signals in time slot $t$ is given by

$$y_t = H u_t + n_t, \quad t = 0, \ldots, T_c - 1. \tag{6}$$

In (6), $n_t = (n_{1,t}, \ldots, n_{K,t})^T \sim \mathcal{CN}(0, N_0 I_K)$ denotes the additive white Gaussian noise (AWGN) vector. The vector $u_t \in \mathbb{C}^N$ is the transmit vector in time slot $t$, which will be defined shortly. Each row vector of the channel matrix $H \in \mathbb{C}^{K \times N}$ corresponds to the channel gains between the BS and each user. It is natural for the MIMO-BC to assume that each element of the channel matrix is $O(1)$ and that the time-average transmit power is constrained to below $P$. For convenience in analysis, we make an equivalent assumption: We assume that $H$ has independent circularly symmetric complex Gaussian elements with variance $1/N$ and that the time-average transmit power is constrained to below $NP$, i.e.

$$\frac{1}{T_c}\sum_{t=0}^{T_c-1} \|u_t\|^2 \le NP. \tag{7}$$

Under these assumptions, the transmit SNR is defined as $P/N_0$.

Slow fading is considered in this paper, i.e. $T_c \to \infty$. Note that the channel matrix in (6) is fixed during $T_c$ time slots. In this situation, we can assume that the channel matrix $H$ is known to the transmitter, since the transmitter can estimate the channel matrix on the basis of pilot signals transmitted from each user in a negligibly small portion of one fading block.
B. Zero-Forcing Transmit Beamforming

Let $x_t = (x_{1,t}, \ldots, x_{K,t})^T \in \mathbb{C}^K$ denote the data symbol vector in time slot $t$. The set $\mathcal{X}_k^{(T_c)} = \{x_{k,t} : t = 0, \ldots, T_c - 1\}$ corresponds to the data symbols sent to user $k$, and is assumed to be independent for different $k$. Throughout this paper, power allocation is not considered: The data symbols $\{x_{k,t}\}$ are assumed to be independent and identically distributed (i.i.d.) zero-mean complex random variables with unit variance.

The BS uses the information about the channel matrix to pre-cancel IUI. For $K \le N$, the simplest method for pre-cancellation is ZFBF [20], in which the transmit vector $u_t$ is given by

$$u_t = \sqrt{\frac{NP}{\mathcal{E}(H, \{x_t\})}}\, u^{(\mathrm{ZF})}(H, x_t), \tag{8}$$

with

$$u^{(\mathrm{ZF})}(H, x_t) = H^H (H H^H)^{-1} x_t. \tag{9}$$

In (8), the energy penalty $\mathcal{E}(H, \{x_t\})$ is defined as the time-average power of the ZFBF vectors (9),

$$\mathcal{E}(H, \{x_t\}) = \frac{1}{T_c}\sum_{t=0}^{T_c-1} \left\| u^{(\mathrm{ZF})}(H, x_t) \right\|^2. \tag{10}$$

It is straightforward to confirm that the transmit vector (8) satisfies the power constraint (7).

The ZFBF (8) decomposes the MIMO-BC (6) into per-user channels

$$y_{k,t} = \sqrt{\frac{NP}{\mathcal{E}(H, \{x_t\})}}\, x_{k,t} + n_{k,t}, \tag{11}$$

for all $k$. This implies that the receive SNR is given by $NP/(\mathcal{E}(H, \{x_t\}) N_0)$. The drawback of the ZFBF is an increase of the energy penalty (10). Substituting the ZFBF vector (9) into (10) yields

$$\mathcal{E}(H, \{x_t\}) = \frac{1}{T_c}\sum_{t=0}^{T_c-1} x_t^H (H H^H)^{-1} x_t \tag{12}$$

$$\to \mathrm{Tr}\left\{ (H H^H)^{-1} \right\}, \tag{13}$$

in $T_c \to \infty$. The Marčenko-Pastur law [50] implies that the energy penalty (13) per user converges almost surely to

$$\frac{1}{K}\mathcal{E}(H, \{x_t\}) \to \frac{1}{1 - \alpha}, \tag{14}$$

in the large-system limit, where both $K$ and $N$ tend to infinity with their ratio $\alpha = K/N$ kept constant. The asymptotic energy penalty (14) diverges as $\alpha$ gets closer to 1 from below. Since the receive SNR $NP/(\mathcal{E}(H, \{x_t\}) N_0)$ is inversely proportional to the energy penalty, this divergence results in a fatal degradation of the receive SNR.
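
As a numerical sanity check of (13) and (14), the following minimal sketch draws one channel matrix with i.i.d. $\mathcal{CN}(0, 1/N)$ entries and compares $\mathrm{Tr}\{(HH^H)^{-1}\}/K$ with $1/(1-\alpha)$. The function name, the system size, and the random seed are illustrative assumptions and are not taken from the original analysis.

```python
import numpy as np

def zfbf_energy_penalty_per_user(K, N, rng):
    """Return Tr{(H H^H)^{-1}} / K for one K x N channel with CN(0, 1/N) entries."""
    # Real and imaginary parts each carry half of the per-entry variance 1/N.
    H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2 * N)
    gram = H @ H.conj().T                      # K x K matrix H H^H
    return np.trace(np.linalg.inv(gram)).real / K

rng = np.random.default_rng(0)                 # arbitrary seed (assumption)
N, K = 400, 200                                # alpha = K / N = 0.5
alpha = K / N
empirical = zfbf_energy_penalty_per_user(K, N, rng)
limit = 1.0 / (1.0 - alpha)                    # right-hand side of (14)
print(f"empirical {empirical:.3f} vs asymptotic {limit:.3f}")
```

Repeating the experiment with $K$ closer to $N$ should show the penalty growing quickly, which is the divergence discussed above.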



C. Vector Precoding 

As a method for overcoming the drawback of ZFBF, VP with ZFBF was proposed [26], [27]. In VP, each data symbol $x_{k,t}$ is modified to take values in a relaxed alphabet $\mathcal{M}_{x_{k,t}} \subset \mathbb{C}$ depending on the original data symbol $x_{k,t}$, to reduce the energy penalty. The modified data symbol vector $\tilde{x}_t \in \prod_{k=1}^{K} \mathcal{M}_{x_{k,t}}$ based on the minimization of the energy penalty (12) is given by

$$\tilde{x}_t = \operatorname*{argmin}_{\tilde{x}_t \in \prod_{k=1}^{K} \mathcal{M}_{x_{k,t}}} \tilde{x}_t^H (H H^H)^{-1} \tilde{x}_t. \tag{15}$$

Note that the modified vector (15), which minimizes each instantaneous power $\|u^{(\mathrm{ZF})}(H, \tilde{x}_t)\|^2$ for the ZFBF (9), also minimizes the energy penalty (12) for the ZFBF.

Example 1 (CVP). Suppose that QPSK is used. For CVP [27], the relaxed alphabet $\mathcal{M}_x$ for a QPSK data symbol $x$ is given by

$$\mathcal{M}_x = \mathcal{M}_{\Re[x]} + \mathrm{i}\,\mathcal{M}_{\Im[x]}, \tag{16}$$

with

$$\mathcal{M}_b = \begin{cases} [b, \infty) & \text{for } b = 1/\sqrt{2} \\ (-\infty, b] & \text{for } b = -1/\sqrt{2}. \end{cases} \tag{17}$$

Müller et al. [27] showed that the CVP results in a significant reduction of the energy penalty, compared to the conventional ZFBF. The minimization problem (15) with (16) reduces to a quadratic optimization problem [29], so that one can use an efficient algorithm to solve (15).
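
To make the structure of the quadratic problem concrete, here is a minimal sketch that applies projected gradient descent to (15) with the CVP alphabet (16), (17). This is a generic method chosen for illustration and is not the algorithm of [29]; the function names, the iteration count, and the problem size are assumptions. Each real and imaginary component is constrained to a half-line, so the projection is a componentwise clamp.

```python
import numpy as np

def energy(H, x):
    """Instantaneous ZFBF power x^H (H H^H)^{-1} x."""
    return np.vdot(x, np.linalg.solve(H @ H.conj().T, x)).real

def cvp_projected_gradient(H, x, iters=500):
    """Approximately solve (15) for QPSK symbols x with the alphabet (16)-(17)."""
    b = 1.0 / np.sqrt(2.0)
    A = np.linalg.inv(H @ H.conj().T)          # Hermitian positive definite
    step = 1.0 / np.linalg.eigvalsh(A)[-1]     # 1 / largest eigenvalue of A
    s_re, s_im = np.sign(x.real), np.sign(x.imag)
    x_mod = x.astype(complex)                  # feasible starting point
    for _ in range(iters):
        y = x_mod - step * (A @ x_mod)         # gradient step on x^H A x
        # Project each real component back onto its half-line: s * y >= b.
        re = s_re * np.maximum(b, s_re * y.real)
        im = s_im * np.maximum(b, s_im * y.imag)
        x_mod = re + 1j * im
    return x_mod

rng = np.random.default_rng(1)                 # arbitrary seed (assumption)
K, N = 8, 10
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2 * N)
x = (rng.choice([-1.0, 1.0], K) + 1j * rng.choice([-1.0, 1.0], K)) / np.sqrt(2)
x_cvp = cvp_projected_gradient(H, x)
print(energy(H, x), energy(H, x_cvp))
```

Because the original QPSK vector is feasible and the step size equals the inverse of the gradient's Lipschitz constant, the objective cannot increase from one iteration to the next, so the returned vector never has a larger energy than plain ZFBF.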

The point of the CVP is that the modified data symbol vector $\tilde{x}_t$ depends on the channel matrix $H$. Consequently, the energy penalty $\mathcal{E}(H, \{\tilde{x}_t\})$, given by (10), for the CVP never tends to (13) in $T_c \to \infty$. In fact, the energy penalty for the CVP was shown to be bounded in the limit $\alpha \to 1$ after taking the large-system limit [30].



D. Joint User Selection and Vector Precoding 

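We propose US-VP based on the combination of US and VP. US-VP is performed every $T$ ($\le T_c$) time slots. Let $\mathcal{K}_i \subset \mathcal{K}_{\mathrm{all}} = \{1, \ldots, K\}$, with size $\tilde{K} = |\mathcal{K}_i|$ ($\le N$), denote the set of selected users in the $i$th block of US ($i = 0, \ldots, T_c/T - 1$). The corresponding modified data symbol vectors are denoted by $\tilde{x}_{\mathcal{K}_i,t} \in \prod_{k \in \mathcal{K}_i} \mathcal{M}_{x_{k,t}}$ for $t = iT, \ldots, (i+1)T - 1$. The set of selected users $\mathcal{K}_i$ and the corresponding modified vectors $\{\tilde{x}_{\mathcal{K}_i,t} : t = iT, \ldots, (i+1)T - 1\}$ are selected$^1$ to minimize the energy penalty (10):

$$(\mathcal{K}_i, \{\tilde{x}_{\mathcal{K}_i,t}\}) = \operatorname{argmin}\ \mathcal{E}_i(H_{\mathcal{K}_i}, \{\tilde{x}_{\mathcal{K}_i,t}\}), \tag{18}$$

where the minimization is taken over $\{\mathcal{K}_i \subset \mathcal{K}_{\mathrm{all}} : |\mathcal{K}_i| = \tilde{K}\}$ and $\{\tilde{x}_{\mathcal{K}_i,t} \in \prod_{k \in \mathcal{K}_i} \mathcal{M}_{x_{k,t}} : t = iT, \ldots, (i+1)T - 1\}$, with

$$\mathcal{E}_i(H_{\mathcal{K}_i}, \{\tilde{x}_{\mathcal{K}_i,t}\}) = \frac{1}{T}\sum_{t=iT}^{(i+1)T-1} \left\| u^{(\mathrm{ZF})}(H_{\mathcal{K}_i}, \tilde{x}_{\mathcal{K}_i,t}) \right\|^2. \tag{19}$$

In (19), the ZFBF vector $u^{(\mathrm{ZF})}(H_{\mathcal{K}_i}, \tilde{x}_{\mathcal{K}_i,t})$ is given by (9). Furthermore, $H_{\mathcal{K}_i} \in \mathbb{C}^{\tilde{K} \times N}$ denotes the channel matrix corresponding to the selected users $\mathcal{K}_i$, which is obtained by collecting the row vectors for the selected users $\mathcal{K}_i$ from the channel matrix $H$.

Example 2 (DD-US). DD-US is defined as the US-VP (18) with the original alphabet as the relaxed alphabet, i.e. $\mathcal{M}_{x_{k,t}} = \{x_{k,t}\}$. Thus, the modified data symbol vector $\tilde{x}_{\mathcal{K}_i,t}$ is equal to the original data symbol vector $x_{\mathcal{K}_i,t} \in \mathbb{C}^{\tilde{K}}$, obtained by stacking the data symbols $\{x_{k,t}\}$ for the selected users $\mathcal{K}_i$. The minimization problem (18) for the DD-US can be approximately solved by using a greedy algorithm proposed in [31].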
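
For very small systems, the DD-US criterion (18), (19) can be evaluated by exhaustive search, which makes the data dependence of the selection explicit. The sketch below is only an illustration under assumed sizes and names; it is not the greedy algorithm of [31], and its cost grows combinatorially in the number of users.

```python
import itertools
import numpy as np

def block_energy(H_sel, X_sel):
    """E_i in (19): time average of x_t^H (H H^H)^{-1} x_t over one block."""
    gram_inv = np.linalg.inv(H_sel @ H_sel.conj().T)
    # X_sel holds one column per time slot; sum the quadratic forms over t.
    total = np.einsum("kt,kl,lt->", X_sel.conj(), gram_inv, X_sel).real
    return total / X_sel.shape[1]

def dd_us_exhaustive(H, X, K_sel):
    """Brute-force solution of (18) for DD-US: try every subset of K_sel users."""
    best_energy, best_subset = np.inf, None
    for subset in itertools.combinations(range(H.shape[0]), K_sel):
        idx = list(subset)
        e = block_energy(H[idx], X[idx])
        if e < best_energy:
            best_energy, best_subset = e, subset
    return best_energy, best_subset

rng = np.random.default_rng(0)                 # arbitrary seed (assumption)
K, N, K_sel, T = 8, 4, 3, 4                    # toy sizes (assumption)
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2 * N)
X = (rng.choice([-1.0, 1.0], (K, T)) + 1j * rng.choice([-1.0, 1.0], (K, T))) / np.sqrt(2)
best_energy, best_subset = dd_us_exhaustive(H, X, K_sel)
fixed_energy = block_energy(H[[0, 1, 2]], X[[0, 1, 2]])   # one data-independent choice
print(best_subset, best_energy, fixed_energy)
```

Redrawing `X` with the same `H` will generally change the selected subset, which is the defining property of data-dependent selection.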

Example 3 (US-CVP). Suppose that QPSK is used. Joint US and CVP (US-CVP) is defined as the US-VP with the CVP (16). Unfortunately, the minimization problem (18) is not convex. It may be possible to extend the greedy algorithm for the DD-US [31] to the US-CVP. Obviously, the obtained algorithm should be more complex than the greedy algorithm for the DD-US.

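The transmit vector $u_t$ for US-VP (18) in time slot $t$ is given by

$$u_t = \sqrt{\frac{NP}{\mathcal{E}(\{H_{\mathcal{K}_i}\}, \{\tilde{x}_{\mathcal{K}_i,t}\})}}\, u^{(\mathrm{ZF})}(H_{\mathcal{K}_i}, \tilde{x}_{\mathcal{K}_i,t}), \quad t = iT, \ldots, (i+1)T - 1, \tag{20}$$

where the energy penalty $\mathcal{E}(\{H_{\mathcal{K}_i}\}, \{\tilde{x}_{\mathcal{K}_i,t}\})$ for the US-VP is defined as

$$\mathcal{E}(\{H_{\mathcal{K}_i}\}, \{\tilde{x}_{\mathcal{K}_i,t}\}) = \frac{T}{T_c}\sum_{i=0}^{T_c/T-1} \mathcal{E}_i(H_{\mathcal{K}_i}, \{\tilde{x}_{\mathcal{K}_i,t}\}), \tag{21}$$

with (19). In order to simplify detection in each user, the data symbols for non-selected users are discarded at the transmitter side [31]. This implies that the ZFBF with US-VP (20) decomposes the MIMO-BC (6) into per-user channels

$$y_{k,t} = \sqrt{\frac{NP}{\mathcal{E}(\{H_{\mathcal{K}_i}\}, \{\tilde{x}_{\mathcal{K}_i,t}\})}} \left\{ s_{k,i}\tilde{x}_{k,t} + (1 - s_{k,i}) I_{k,t} \right\} + n_{k,t}, \tag{22}$$

for all $k$. In (22), $\tilde{x}_{k,t} \in \mathcal{M}_{x_{k,t}}$ denotes the modified data symbol corresponding to the original data symbol $x_{k,t}$. The variable $s_{k,i} \in \{0, 1\}$ indicating whether user $k$ has been selected in block $i$ is defined as

$$s_{k,i} = \begin{cases} 1 & k \in \mathcal{K}_i \\ 0 & k \notin \mathcal{K}_i. \end{cases} \tag{23}$$

Furthermore, $I_{k,t} \in \mathbb{C}$ denotes the interference to the non-selected user $k \notin \mathcal{K}_i$, given by

$$I_{k,t} = h_k u^{(\mathrm{ZF})}(H_{\mathcal{K}_i}, \tilde{x}_{\mathcal{K}_i,t}), \tag{24}$$

where $h_k \in \mathbb{C}^{1 \times N}$ denotes the $k$th row vector of the channel matrix $H$. Note that the indices $t$ of $y_{k,t}$ and $\tilde{x}_{k,t}$ in (22) are identical to each other, since the data symbols for the non-selected users $k \notin \mathcal{K}_i$ have been discarded. This simplifies the detection of (23).

$^1$ If there are multiple solutions, one solution is selected randomly and uniformly.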

It is easy for user $k$ to blind-detect one variable $s_{k,i}$ from the $T$ observations $\{y_{k,t}\}$ in each block. Using the decision feedback of $\tilde{x}_{k,t}$ from the decoder improves the accuracy of detection [31]. In order to reduce the energy penalty, a small $T$ should be used. On the other hand, too small a $T$ makes detection difficult. As one option, dozens of time slots should be
used as the block length $T$. For example, the energy loss due to detection errors is at most 0.2-0.5 dB for $T = 10$ [31].

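Remark 1. Let us discuss the relationship between the DD-US and conventional US. The set of selected users $\mathcal{K}_0$ in the first block for the DD-US is given by

$$\mathcal{K}_0 = \operatorname*{argmin}_{\mathcal{K}_0 \subset \mathcal{K}_{\mathrm{all}} : |\mathcal{K}_0| = \tilde{K}} \mathcal{E}_0(H_{\mathcal{K}_0}, \{x_{\mathcal{K}_0,t}\}), \tag{25}$$

with (19). On the other hand, when the minimization of the energy penalty (10) for the ZFBF, or equivalently of (12) in $T_c \to \infty$, is used as the US criterion, the set of selected users $\mathcal{K}$ for conventional US is given by

$$\mathcal{K} = \operatorname*{argmin}_{\mathcal{K} \subset \mathcal{K}_{\mathrm{all}} : |\mathcal{K}| = \tilde{K}}\ \lim_{T \to \infty} \mathcal{E}_0(H_{\mathcal{K}}, \{x_{\mathcal{K},t}\}), \tag{26}$$

where we have re-written the coherence time as $T$. Note that (26) is independent of the data symbols, since the objective function tends to $\mathrm{Tr}\{(H_{\mathcal{K}} H_{\mathcal{K}}^H)^{-1}\}$.

The minimization and the limit in (26) are not commutative. It is straightforward to prove the inequality

$$\lim_{T \to \infty} \min_{\mathcal{K}_0} \mathcal{E}_0(H_{\mathcal{K}_0}, \{x_{\mathcal{K}_0,t}\}) \le \min_{\mathcal{K}_0} \lim_{T \to \infty} \mathcal{E}_0(H_{\mathcal{K}_0}, \{x_{\mathcal{K}_0,t}\}). \tag{27}$$

Comparing (25) and (26), we find that the energy penalty of the DD-US in $T \to \infty$ provides a lower bound on that of the conventional US. Let us prove the inequality (27). We start with a trivial inequality

$$\min_{\mathcal{K}_0 \subset \mathcal{K}_{\mathrm{all}} : |\mathcal{K}_0| = \tilde{K}} \mathcal{E}_0(H_{\mathcal{K}_0}, \{x_{\mathcal{K}_0,t}\}) \le \mathcal{E}_0(H_{\mathcal{K}}, \{x_{\mathcal{K},t}\}), \tag{28}$$

where $\mathcal{K} \subset \mathcal{K}_{\mathrm{all}}$ with $|\mathcal{K}| = \tilde{K}$ denotes the set of selected users (26) for the conventional US. We next take the limit $T \to \infty$. Since $\mathcal{K}$ is independent of the data symbols, we can use the weak law of large numbers for the right-hand side (RHS) of (28) to find that $\mathcal{E}_0(H_{\mathcal{K}}, \{x_{\mathcal{K},t}\})$ converges in probability to $\mathrm{Tr}\{(H_{\mathcal{K}} H_{\mathcal{K}}^H)^{-1}\}$, or equivalently the RHS of (27), in $T \to \infty$. Thus, we obtain the inequality (27).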

III. Main Results 

A. Large-System Analysis 

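We use the replica method to analyze the performance of US-VP in the large-system limit where the number of transmit antennas $N$, the number of users $K$, and the number of selected users $\tilde{K}$ tend to infinity while their ratios $\alpha = K/N$ and $\kappa = \tilde{K}/K$ are kept constant. Without loss of generality, we focus on the first block $i = 0$ of US-VP and drop the subscripts $i$ from $\mathcal{E}_i(H_{\mathcal{K}_i}, \{\tilde{x}_{\mathcal{K}_i,t}\})$, $\mathcal{K}_i$, and $s_{k,i}$.

The asymptotic performance of US-VP is characterized via a solvable US-VP problem. We first define the solvable problem. For a positive parameter $q$, let us define a random variable $E_k(\{\tilde{x}_{k,t}\}, q)$ as

$$E_k(\{\tilde{x}_{k,t}\}, q) = \frac{1}{T}\sum_{t=0}^{T-1} \left| \tilde{x}_{k,t} - \sqrt{q}\, z_{k,t} \right|^2. \tag{29}$$

In (29), $\{z_{k,t}\}$ are independent circularly symmetric complex Gaussian random variables with unit variance. The normalized parameter $q/(\alpha\kappa)$ will be shortly shown to be equal to the average energy penalty per selected user in the large-system limit. The solvable US-VP problem is the following minimization problem:

$$E_{\mathcal{K}} = \min_{\mathcal{K} \subset \mathcal{K}_{\mathrm{all}} : |\mathcal{K}| = \tilde{K}}\ \min_{\{\tilde{x}_{k,t} \in \mathcal{M}_{x_{k,t}}\}} \frac{1}{K}\sum_{k \in \mathcal{K}} E_k(\{\tilde{x}_{k,t}\}, q). \tag{30}$$

The asymptotic performance of US-VP is characterized via three quantities for (30) in the large-system limit. The minimization in (30) with respect to $\{\tilde{x}_{k,t}\}$ is straightforwardly solved to obtain

$$E_{\mathcal{K}} = \min_{\mathcal{K} \subset \mathcal{K}_{\mathrm{all}} : |\mathcal{K}| = \tilde{K}} \frac{1}{K}\sum_{k \in \mathcal{K}} E_k(q), \tag{31}$$

with

$$E_k(q) = \frac{1}{T}\sum_{t=0}^{T-1} \left| \tilde{x}_{k,t}^{(\mathrm{opt})}(q) - \sqrt{q}\, z_{k,t} \right|^2, \tag{32}$$

where $\tilde{x}_{k,t}^{(\mathrm{opt})}(q)$ denotes the optimal modified data symbol, given by

$$\tilde{x}_{k,t}^{(\mathrm{opt})}(q) = \operatorname*{argmin}_{\tilde{x}_{k,t} \in \mathcal{M}_{x_{k,t}}} \left| \tilde{x}_{k,t} - \sqrt{q}\, z_{k,t} \right|^2. \tag{33}$$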

In order to solve the minimization (31) analytically, we write the order statistics for the random variables $\{E_k(q)\}$ as $\{E_{(k)}(q)\}$, i.e. $E_{(1)}(q) \le \cdots \le E_{(K)}(q)$ [48]. The minimization (31) reduces to

$$E_{\mathcal{K}} = \frac{1}{K}\sum_{k=1}^{K} 1\!\left( \frac{k}{K} \le \kappa \right) E_{(k)}(q). \tag{34}$$
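
The reduction from (31) to (34) says that the best subset simply consists of the users with the smallest per-user costs. The following minimal sketch checks this on a toy DD-US instance, for which the optimal modified symbol in (33) equals the data symbol itself. All function names, sizes, and the seed are illustrative assumptions.

```python
import itertools
import numpy as np

def per_user_cost_dd_us(X, Z, q):
    """E_k(q) in (32) for DD-US, where the modified symbol equals the data symbol."""
    return np.mean(np.abs(X - np.sqrt(q) * Z) ** 2, axis=1)

def solvable_problem_value(E, K_sel):
    """Right-hand side of (34): sum of the K_sel smallest E_k(q), divided by K."""
    return np.sort(E)[:K_sel].sum() / E.size

def brute_force_value(E, K_sel):
    """Right-hand side of (31), evaluated by enumerating all subsets of size K_sel."""
    subsets = itertools.combinations(range(E.size), K_sel)
    return min(E[list(s)].sum() for s in subsets) / E.size

rng = np.random.default_rng(0)                 # arbitrary seed (assumption)
K, K_sel, T, q = 8, 3, 4, 0.5                  # toy sizes (assumption)
X = (rng.choice([-1.0, 1.0], (K, T)) + 1j * rng.choice([-1.0, 1.0], (K, T))) / np.sqrt(2)
Z = (rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))) / np.sqrt(2)
E = per_user_cost_dd_us(X, Z, q)
value_sorted = solvable_problem_value(E, K_sel)
value_brute = brute_force_value(E, K_sel)
print(value_sorted, value_brute)
```

Sorting costs $O(K \log K)$ operations, whereas the enumeration is combinatorial, which is why the order-statistic form (34) is the one used in the analysis.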

The three quantities that characterize the asymptotic performance of US-VP are the mean and variance of (34) in the large-system limit, and the $\tilde{K}$th order statistic $E_{(\tilde{K})}(q)$ for (32) in the large-system limit. The three quantities are given via the cumulative distribution function (cdf) of (32),

$$F_T(x; q) = \Pr(E_k(q) \le x). \tag{35}$$

Note that the cdf (35) is monotonically increasing, because of $z_{k,t} \sim \mathcal{CN}(0, 1)$. Thus, there exists the inverse function of (35), denoted by $F_T^{-1}(x; q)$.

Lemma 1. Let $\xi_{\kappa,T}(q)$ denote the $\kappa$-quantile for the cdf (35),

$$\xi_{\kappa,T}(q) = F_T^{-1}(\kappa; q). \tag{36}$$

Then, the $\tilde{K}$th order statistic $E_{(\tilde{K})}(q)$ for (32) converges in probability to the $\kappa$-quantile $\xi_{\kappa,T}(q)$ in the large-system limit.

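Proof of Lemma 1: Since $E_{(\tilde{K})}(q)$ is a sample $\kappa$-quantile for the independent random variables (32) with the common cdf (35), Bahadur's theorem [51] or its modification [52] implies that

$$E_{(\tilde{K})}(q) = \xi_{\kappa,T}(q) + \frac{\kappa - \hat{F}_T(\xi_{\kappa,T}(q); q)}{F_T'(\xi_{\kappa,T}(q); q)} + o(K^{-1/2}), \tag{37}$$

in the large-system limit, where $F_T'(x; q)$ denotes the derivative of the cdf (35) with respect to $x$, and where $\hat{F}_T(x; q)$ is the empirical cdf for (32), given by

$$\hat{F}_T(x; q) = \frac{1}{K}\sum_{k=1}^{K} 1(E_k(q) \le x). \tag{38}$$

The mean and variance of the empirical cdf (38) at $x = \xi_{\kappa,T}(q)$ are given by

$$\mathbb{E}[\hat{F}_T(\xi_{\kappa,T}(q); q)] = \kappa, \tag{39}$$

$$\mathbb{V}[\hat{F}_T(\xi_{\kappa,T}(q); q)] = \frac{\kappa(1 - \kappa)}{K}, \tag{40}$$

respectively. Thus, the second term on the RHS of (37) is a quantity of $O(K^{-1/2})$. This observation implies that (37) converges in probability to the $\kappa$-quantile (36) in the large-system limit. ■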

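Lemma 2 (Stigler 1974). Let $\mu_{\kappa,T}(q)$ and $\sigma_{\kappa,T}^2(q)$ denote the mean of $E_{\mathcal{K}}$ and the variance of $\sqrt{K} E_{\mathcal{K}}$, given by (34), in the large-system limit. Then,

$$\mu_{\kappa,T}(q) = \int_{0}^{\kappa} F_T^{-1}(x; q)\,dx, \tag{41}$$

$$\sigma_{\kappa,T}^2(q) = \int_{[0,\,\xi_{\kappa,T}(q)]^2} \left[ F_T(\min(x, y); q) - F_T(x; q) F_T(y; q) \right] dx\,dy, \tag{42}$$

where the $\kappa$-quantile $\xi_{\kappa,T}(q)$ is given by (36).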

Proof of Lemma 2: The function $1(k/K \le \kappa)$ in (34) is bounded and continuous almost everywhere (a.e.) $F_T^{-1}$, since the cdf (35) is monotonically increasing. Thus, we can use Stigler's theorems [53, Theorems 1 and 3] to obtain (41) and (42). ■
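
The limit (41) can be illustrated numerically with any distribution whose quantile function is known in closed form. The sketch below uses the unit-mean exponential distribution purely as a stand-in (an assumption for illustration; it is not the distribution of $E_k(q)$), for which $F^{-1}(u) = -\ln(1-u)$ and the integral over $[0, \kappa]$ equals $\kappa + (1-\kappa)\ln(1-\kappa)$.

```python
import numpy as np

def trimmed_l_statistic(samples, kappa):
    """(1/K) times the sum of the floor(kappa*K) smallest samples, as in (34)."""
    K = samples.size
    K_sel = int(np.floor(kappa * K))
    return np.sort(samples)[:K_sel].sum() / K

def exponential_limit(kappa):
    """Integral of F^{-1}(u) = -ln(1-u) over [0, kappa] for the Exp(1) law."""
    return kappa + (1.0 - kappa) * np.log(1.0 - kappa)

rng = np.random.default_rng(0)                 # arbitrary seed (assumption)
kappa = 0.5
samples = rng.exponential(1.0, size=200_000)   # stand-in for {E_k(q)}
estimate = trimmed_l_statistic(samples, kappa)
target = exponential_limit(kappa)
print(estimate, target)
```

For the actual problem, the same estimator can be applied to samples of $E_k(q)$ when the cdf (35) is not available in closed form.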

We need to calculate the cdf (35) to evaluate the three quantities (36), (41), and (42). See Appendix A for how to calculate the cdf (35).
B. Average Energy Penalty

The average of the energy penalty (21) for US-VP (18), denoted by $\bar{\mathcal{E}} = \mathbb{E}[\mathcal{E}(\{H_{\mathcal{K}_i}\}, \{\tilde{x}_{\mathcal{K}_i,t}\})]$, is analyzed in the large-system limit. We use the replica method under the RS and 1RSB assumptions [33], [34]. Roughly speaking, the RS assumption corresponds to postulating that there are no local minimizers to the minimization (18) in the large-system limit. On the other hand, the 1RSB assumption is the simplest assumption for the case where there are many local minimizers in the large-system limit.

Proposition 1. Suppose that $q_0$ is the solution to the fixed-point equation

$$q_0 = \alpha\mu_{\kappa,T}(q_0), \tag{43}$$

where $\mu_{\kappa,T}(q)$ is given by (41). If (43) has multiple solutions, the smallest solution $q_0$ is selected. Under the RS assumption, the average energy penalty per selected user $\bar{\mathcal{E}}/\tilde{K}$ converges to $q_0/(\alpha\kappa)$ in the large-system limit.

Derivation of Proposition 1: See Appendix C. ■
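
The fixed-point equation (43) can be explored numerically by replacing $\mu_{\kappa,T}(q)$ with a Monte Carlo estimate built from (32) and (34). The sketch below does this for DD-US with QPSK; the sample size, the parameters, the seed, and the plain fixed-point iteration started at $q = 0$ (a heuristic for targeting the smallest solution) are all assumptions made for illustration, not the evaluation procedure of the paper.

```python
import numpy as np

def mu_hat(q, X, Z, kappa):
    """Monte Carlo estimate of mu_{kappa,T}(q) for DD-US via (32) and (34)."""
    E = np.mean(np.abs(X - np.sqrt(q) * Z) ** 2, axis=1)   # E_k(q), one per user
    K = E.size
    K_sel = int(np.floor(kappa * K))
    return np.sort(E)[:K_sel].sum() / K

def solve_fixed_point(alpha, kappa, X, Z, q_init=0.0, iters=200):
    """Iterate q <- alpha * mu_hat(q) with common random numbers X and Z."""
    q = q_init
    for _ in range(iters):
        q = alpha * mu_hat(q, X, Z, kappa)
    return q

rng = np.random.default_rng(0)                 # arbitrary seed (assumption)
K, T = 20000, 4                                # sample size and block length (assumption)
alpha, kappa = 2.0, 0.25                       # so that alpha * kappa = 0.5
X = (rng.choice([-1.0, 1.0], (K, T)) + 1j * rng.choice([-1.0, 1.0], (K, T))) / np.sqrt(2)
Z = (rng.standard_normal((K, T)) + 1j * rng.standard_normal((K, T))) / np.sqrt(2)
q0 = solve_fixed_point(alpha, kappa, X, Z)
penalty_dd_us = q0 / (alpha * kappa)           # RS prediction per selected user
penalty_random = 1.0 / (1.0 - alpha * kappa)   # (14) with alpha replaced by alpha * kappa
print(penalty_dd_us, penalty_random)
```

Increasing `T` in this sketch should move the estimate toward the random-selection value, since averaging over a long block removes the spread among the per-user costs that the selection exploits.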

Proposition 2. Suppose that $q_1$ satisfies the coupled fixed-point equations

In ( 1 + - ) = r I _ , (44) 

Ami) : (45) 

^ / 

for some $0 < x < \infty$, in which $\mu_{\kappa,T}(q_1)$ and $\sigma_{\kappa,T}^2(q_1)$ are given by (41) and (42), respectively. If there are multiple solutions, the smallest solution $q_1$ is selected. Under the 1RSB assumption, the average energy penalty per selected user $\bar{\mathcal{E}}/\tilde{K}$ converges to $q_1/(\alpha\kappa)$ in the large-system limit.

Derivation of Proposition 2: See Appendix D. ■
The asymptotic energy penalty for VP was calculated with the R-transform for the empirical eigenvalue distribution of $(HH^H)^{-1}$ [27], [30]. Since it is difficult to apply this method to our case, another method is used in the calculation of the energy penalty, as presented in the appendices. Note that the meanings of RSB are different for the two methods. The two methods should yield the same result under the full-RSB assumption, since the full-RSB assumption is expected to provide the correct solution [46], [47]. However, they may yield different results under the RS and 1RSB assumptions, since these assumptions are approximations. In fact, the two methods seem to yield different results under the 1RSB assumption, whereas the same result is obtained under the RS assumption.

It is straightforward to show that the RS assumption provides a smaller prediction of the energy penalty than the 1RSB assumption.

Property 1. Let $q_0$ and $q_1$ denote the solutions defined in Proposition 1 and Proposition 2, respectively. Then,

$$q_0 < q_1. \tag{46}$$

Proof of Property 1: Eliminating $\sigma_{\kappa,T}^2(q_1)$ from (44) and (45) yields

2x In + = aM.,T(gi). (47) 

V xJ x + qi 

We write the left-hand side (LHS) of (47) as $f(q_1, x)$. It is straightforward to prove $f(q, x) < q$ for any $q > 0$ and $x > 0$. Calculating the first and second derivatives of $f(q, x)$ with respect to $x$, we obtain

2/=21„fl + i)-fl±M, (48) 

OX \ xJ ix + ir 

— - = < 0. (49) 

dx^ xix + 

Since $\lim_{x \to \infty} \partial f/\partial x = 0$, (48) and (49) imply $\partial f/\partial x > 0$ for any $q > 0$. Furthermore, $\lim_{x \to \infty} f(q, x) = q$ indicates $f(q, x) < q$ for any $q > 0$ and $x > 0$.

Let us prove $q_0 < q_1$. Since (41) is positive in the limit $q \to 0$, from (43) we find $q < \alpha\mu_{\kappa,T}(q)$ for any $q \in (0, q_0)$. Then,

$$f(q, x) < q \le \alpha\mu_{\kappa,T}(q), \tag{50}$$

for any $q \in (0, q_0]$ and $x > 0$. This inequality implies that (47) has no solutions for any $q_1 \in (0, q_0]$. Thus, we obtain $q_0 < q_1$. ■

We next calculate the average energy penalty in $T \to \infty$ to derive a performance bound for separate US-VP.
Corollary 1. Suppose that $q_0$ is the solution to the fixed-point equation

$$q_0 = \alpha\kappa\,\mathbb{E}\left[ \min_{\tilde{x}_{k,t} \in \mathcal{M}_{x_{k,t}}} \left| \tilde{x}_{k,t} - \sqrt{q_0}\, z_{k,t} \right|^2 \right]. \tag{51}$$

If (51) has multiple solutions, the smallest solution $q_0$ is selected. Under the RS assumption, the average energy penalty per selected user $\bar{\mathcal{E}}/\tilde{K}$ converges to $q_0/(\alpha\kappa)$ in $T \to \infty$ after taking the large-system limit.

Proof of Corollary 1: See Appendix E. ■
An informal derivation of (ISTI is as follows: First of all, one should recall that jj^.Tiq) have been defined as the mean of 
dSTl l in the large-system Umit. The weak law of large numbers implies that each term (|32] | in (l3Tl i converges in probability to 
the expectation E[i?fe(q)] in T — >^ oo, which is equal to the expectation on the RHS of ( BTT ) with qo = q. Thus, the minimization 
in dSTT i should make no sense in T ^ oo, i.e., ( |3TI ) should tend to KK[Ek{q)] in T ^ oo. This implies that the fixed-point 
equation ( l43T l reduces to ( BTT i in T — > oo. 

As noted in Remark 1, the energy penalty for the DD-US as $T \to \infty$ provides a lower bound on that for the conventional
(data-independent) US (26). For the DD-US, it is possible to solve (51):

$$\frac{q_0}{\alpha\kappa} = \frac{1}{1-\alpha\kappa}. \qquad (52)$$

Comparing (14) and (52), we find that the energy penalty for the DD-US as $T \to \infty$ is achievable by the ZFBF with random
US (RUS), referred to as ZFBF-RUS in this paper, in which a subset of users with size $K$ is selected randomly and uniformly.
This observation under the RS assumption implies that the performance of the ZFBF with the optimal US is equal to that of
the ZFBF-RUS in the large-system limit. Furthermore, from Property 1 we can conclude that the 1RSB assumption yields a
wrong result as $T \to \infty$. The same statements also hold for separate US-CVP: Under the RS assumption, the performance
of separate US-CVP is equal to that of the CVP with RUS (CVP-RUS) in the large-system limit. Furthermore, the 1RSB
assumption yields a wrong result as $T \to \infty$.
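The step from (51) to (52) can be illustrated with a short fixed-point iteration. The sketch assumes unit-power data symbols and $z_{k,t} \sim \mathcal{CN}(0, 1)$ independent of the data, so that for the DD-US (where the modified symbol equals the data symbol) the expectation in (51) equals $1 + q_0$. The function names are hypothetical.

```python
def dd_us_fixed_point(alpha_kappa, max_iters=100000, tol=1e-12):
    """Iterate q <- alpha*kappa*(1 + q), i.e. (51) for the DD-US as T -> infinity.

    Assumes E|x - sqrt(q) z|^2 = 1 + q (unit-power x, z ~ CN(0,1)).
    Starting from q = 0 yields the smallest solution when alpha_kappa < 1.
    """
    if not 0.0 < alpha_kappa < 1.0:
        raise ValueError("alpha_kappa must lie in (0, 1)")
    q = 0.0
    for _ in range(max_iters):
        q_next = alpha_kappa * (1.0 + q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    raise RuntimeError("fixed-point iteration did not converge")


def energy_penalty_per_selected_user(alpha_kappa):
    """q0 / (alpha*kappa), to be compared with 1 / (1 - alpha*kappa) in (52)."""
    return dd_us_fixed_point(alpha_kappa) / alpha_kappa


if __name__ == "__main__":
    for load in (0.1, 0.5, 0.9):
        print(load, energy_penalty_per_selected_user(load), 1.0 / (1.0 - load))
```

The iteration is a contraction with factor $\alpha\kappa$, so it converges for any load below one and reproduces the closed form in (52).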

One cannot conclude from the results under the RS and 1RSB assumptions that conventional (data-independent) US makes
no sense in the large-system limit, since there is a possibility that the full-RSB assumption provides a smaller energy penalty
than the RS assumption. Unfortunately, it is difficult to calculate the energy penalty under higher-step RSB assumptions, so
whether this statement is correct should be checked with another methodology. We leave this issue as future work,
since it is beyond the scope of this paper.



C. Sum Rate 

Before investigating the achievable sum rate of US-VP, the joint distribution of the indicator variable (23) and the modified
data symbols $\tilde{\mathcal{X}}_k = \{\tilde{x}_{k,t} : t = 0, \ldots, T-1\}$ given the data symbols $\mathcal{X}_k = \{x_{k,t} : t = 0, \ldots, T-1\}$ is analyzed in the
large-system limit. This joint distribution is used to calculate the achievable sum rate.

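Let $\{\mathcal{A}_t\}$ denote measurable subsets of $\mathbb{C}$ for $t = 0, \ldots, T-1$. The joint distribution is shown to be characterized via the
conditional probability

$$\Pr\left(E_k(q) \le \xi_{\kappa,T}(q),\ \bigcap_{t=0}^{T-1}\left\{\tilde{x}_{k,t}^{(\mathrm{opt})}(q) \in \mathcal{A}_t\right\} \,\Big|\, \mathcal{X}_k\right), \qquad (53)$$

where $E_k(q)$, $\tilde{x}_{k,t}^{(\mathrm{opt})}(q)$, and $\xi_{\kappa,T}(q)$ are given by (32), (33), and (36), respectively.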

Proposition 3. Suppose that $q_0$ is the same solution as in Proposition 1. Under the RS assumption, the conditional joint
probability $\Pr(s_k = 1, \tilde{\mathcal{X}}_k \in \prod_{t=0}^{T-1}\mathcal{A}_t \,|\, \mathcal{X}_k)$ converges to (53) with $q = q_0$ in the large-system limit.

Derivation of Proposition 3: See Appendix F. ■

Proposition 4. Suppose that $q_1$ is the same solution as in Proposition 2. Under the 1RSB assumption, the conditional joint
probability $\Pr(s_k = 1, \tilde{\mathcal{X}}_k \in \prod_{t=0}^{T-1}\mathcal{A}_t \,|\, \mathcal{X}_k)$ converges to (53) with $q = q_1$ in the large-system limit.

Derivation of Proposition 4: See Appendix G. ■
It is straightforward to find $\Pr(s_k = 1) = \kappa$: Marginalizing (53) yields

$$\Pr(s_k = 1) = \Pr\left(E_k(q) \le \xi_{\kappa,T}(q)\right), \qquad (54)$$

which is equal to the cdf $F_T(\xi_{\kappa,T}(q); q)$, given by (35). The definition of the $\kappa$-quantile (36) implies $\Pr(s_k = 1) = \kappa$ for any $q$.

We next investigate the achievable sum rate of US-VP in the large-system limit. The achievable sum rate $R$ is given by

$$R = \sum_{k=1}^{K} R_k, \qquad (55)$$

where the achievable rate $R_k$ for user $k$ is defined as the mutual information per time slot between all data symbols $\mathcal{X}_k^{T_c}$ for
user $k$ and all received signals $\mathcal{Y}_k^{T_c} = \{y_{k,t} : t = 0, \ldots, T_c - 1\}$ for user $k$ [54], [55],

$$R_k = \lim_{T_c\to\infty}\frac{1}{T_c} I(\mathcal{X}_k^{T_c}; \mathcal{Y}_k^{T_c}). \qquad (56)$$

In (56), the received signal $y_{k,t}$ is transmitted through the equivalent channel (22), in which $y_{k,t}$ depends on the data symbol
$x_{k,t}$ through the modified data symbol $\tilde{x}_{k,t} \in \mathcal{M}_{x_{k,t}}$.

A crucial assumption in evaluating the achievable rate (56) is the self-averaging property of the energy penalty (21) in the
large-system limit.

Assumption 1 (Self-Averaging Property). The energy penalty per selected user for US-VP converges in probability to the
expectation $\mathcal{E} = \mathbb{E}[\mathcal{E}(\{H_{\mathcal{K}_i}\}, \{x_{\mathcal{K}_i,t}\})]$ in the large-system limit.

The energy penalty corresponds to the free energy in the low-temperature limit, or equivalently the ground state energy, in statistical
physics [33], [34] (see Appendix C). The normalized ground state energy is believed to be self-averaging for many disordered
systems. In fact, the self-averaging property of the ground state energy was proved for a generalized spin glass model [56]-[58]
and for MIMO systems [59]. Since the proof of Assumption 1 is beyond the scope of this paper, we postulate the self-averaging
property of the energy penalty in the large-system limit.

The following lemma provides a genie-aided upper bound on the achievable sum rate (55) in the large-system limit.

Lemma 3. Suppose that Assumption 1 holds. Then, the achievable sum rate (55) per transmit antenna $R/N$ is bounded from
above by

$$C = \frac{\alpha}{T}\left\{H(\kappa) + \kappa I(\mathcal{X}_k; \mathcal{Y}_k | s_k = 1)\right\} \qquad (57)$$

in the large-system limit, with $\mathcal{X}_k = \{x_{k,t} : t = 0, \ldots, T-1\}$ and $\mathcal{Y}_k = \{y_{k,t} : t = 0, \ldots, T-1\}$ denoting the data symbols
and the received signals in the first block, respectively. In (57), $H(\kappa)$ denotes the binary entropy function

$$H(\kappa) = -\kappa\log\kappa - (1-\kappa)\log(1-\kappa). \qquad (58)$$

Proof of Lemma 3: The received signal $y_{k,t}$ given by (22) depends on all data symbols for user $k$ through the energy
penalty (21). Under Assumption 1, these dependencies disappear in the large-system limit: The equivalent channel (22) reduces
to

$$y_{k,t} = \sqrt{\frac{P}{q}}\left\{s_{k,i}\tilde{x}_{k,t} + (1 - s_{k,i})I_{k,t}\right\} + n_{k,t} \qquad (59)$$

in the large-system limit, where $q$ is equal to $q_0$ or $q_1$ in Propositions 1 and 2. Since the US-VP (18) is performed block by
block, the received signals (59) are i.i.d. block by block. As a result, the achievable rate (56) reduces to

$$R_k = \frac{1}{T} I(\mathcal{X}_k; \mathcal{Y}_k) \qquad (60)$$

in the large-system limit.

We next consider a genie-aided upper bound on (60), in which a genie informs each user about whether he/she has been
selected in the first block,

$$R_k \le \frac{1}{T} I(\mathcal{X}_k; \mathcal{Y}_k, s_k), \qquad (61)$$

where $s_k$ is the indicator variable (23) to represent whether user $k$ has been selected. In (61), we have dropped the subscript $i$
from $s_{k,i}$. The upper bound (61) is formally obtained from the chain rule for mutual information [54],

$$I(\mathcal{X}_k; \mathcal{Y}_k, s_k) = I(\mathcal{X}_k; \mathcal{Y}_k) + I(\mathcal{X}_k; s_k | \mathcal{Y}_k) \ge I(\mathcal{X}_k; \mathcal{Y}_k). \qquad (62)$$

Applying the chain rule for mutual information to the RHS of (61) yields

$$I(\mathcal{X}_k; \mathcal{Y}_k, s_k) = I(\mathcal{X}_k; s_k) + I(\mathcal{X}_k; \mathcal{Y}_k | s_k). \qquad (63)$$

Since $s_k$ is a binary variable, the conditional entropy $H(s_k | \mathcal{X}_k)$ is non-negative [54], so that the first term on the RHS of (63)
is bounded from above by the entropy of $s_k$,

$$I(\mathcal{X}_k; s_k) = H(s_k) - H(s_k | \mathcal{X}_k) \le H(s_k). \qquad (64)$$

Since the prior probability $\Pr(s_k = 1)$ is equal to $\kappa$, the entropy $H(s_k)$ is equal to the binary entropy function $H(\kappa)$. On the
other hand, the second term should be equal to

$$I(\mathcal{X}_k; \mathcal{Y}_k | s_k) = \kappa I(\mathcal{X}_k; \mathcal{Y}_k | s_k = 1), \qquad (65)$$

where $\kappa$ is the probability with which $s_k$ takes 1. This can be understood as follows: The equivalent channel (59) implies that
user $k$ receives the interference (24) when $s_k = 0$. In this case, the interference does not contain the desired data symbols
$\{x_{k,t}\}$ for user $k$. Strictly speaking, there may be dependencies between the received signals for $s_k = 0$ and the desired data
symbols, since the set of selected users depends on the desired data symbols for user $k$. However, these dependencies should be
negligible in the large-system limit, so that we obtain (65). Combining (55), (61), (63), (64) with $H(s_k) = H(\kappa)$, and (65),
we arrive at the upper bound (57). ■

In the derivation of the upper bound (57), we have used the two upper bounds (61) and (64). The looseness of (57) due to
the latter bound is negligible as $T \to \infty$, since $T^{-1}H(\kappa)$ tends to zero. On the other hand, the genie-aided bound (61) also
becomes tight as $T \to \infty$, since the detection of $s_k$ becomes easy as $T$ increases. See [31] for an iterative algorithm to detect
$s_k$. For example, the SNR loss required for detecting $s_k$ is at most 0.5 dB for a sum rate per transmit antenna of 0.5 bps/Hz
when $T = 16$ and QPSK are used. Furthermore, the SNR loss is at most 0.2 dB for 1 bps/Hz. These observations may imply
that the upper bound (57) is reasonably tight when $T$ is a few dozen.

As shown in Appendix B, $s_k$ is independent of $\mathcal{X}_k$ for the DD-US with QPSK. As a result, the mutual information
$I(\mathcal{X}_k; \mathcal{Y}_k | s_k)$ in (57) reduces to

HXk;yk\sk - 1) - KTI{xk,t;yk,t\sk - 1) (66) 

for the DD-US with QPSK. However, it is still hard to calculate the upper bound (57) for general modulation, since $s_k$ depends
on $\mathcal{X}_k$. From Lemma 3 we obtain an upper bound for the DD-US that can be calculated for large $T$.

Corollary 2 (Bound for DD-US). Suppose that Assumption 1 holds. Then, the achievable sum rate (55) per transmit antenna
$R/N$ for the DD-US is bounded from above by

$$C_{\mathrm{DD\text{-}US}} = \alpha\left\{\frac{H(\kappa)}{T} + \kappa I(x_{k,t}; y_{k,t} | s_k = 1)\right\}, \qquad (67)$$

where $H(\kappa)$ denotes the binary entropy function (58).
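To make the structure of the bound concrete, the following sketch evaluates the binary entropy (58) and the bound of Corollary 2 for a given per-symbol mutual information. It assumes the bound has the form $\alpha\{H(\kappa)/T + \kappa I\}$ with entropies measured in bits; the per-symbol mutual information is supplied by the caller, and the function names are hypothetical.

```python
import math


def binary_entropy(kappa):
    """Binary entropy H(kappa) in bits, cf. (58)."""
    if kappa <= 0.0 or kappa >= 1.0:
        return 0.0
    return -(kappa * math.log2(kappa) + (1.0 - kappa) * math.log2(1.0 - kappa))


def dd_us_upper_bound(alpha, kappa, block_length, mi_per_symbol):
    """Assumed form of the Corollary 2 bound: alpha * (H(kappa)/T + kappa * I).

    mi_per_symbol is I(x; y | s_k = 1) in bits per channel use; it must be
    computed separately (e.g. numerically) and is an input here.
    """
    return alpha * (binary_entropy(kappa) / block_length + kappa * mi_per_symbol)


if __name__ == "__main__":
    # alpha = 4 and T = 64 mirror the parameters quoted in the figure captions;
    # kappa = 0.125 and I = 1 bit are arbitrary illustrative values.
    print(dd_us_upper_bound(4.0, 0.125, 64, 1.0))
```

The first term is the price of the genie information about $s_k$; it decays as $1/T$, which is why the bound is expected to tighten for longer blocks.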

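Proof of Corollary 2: It is sufficient from Lemma 3 to prove the following upper bound:

$$I(\mathcal{X}_k; \mathcal{Y}_k | s_k = 1) \le T I(x_{k,t}; y_{k,t} | s_k = 1). \qquad (68)$$

By definition,

$$I(\mathcal{X}_k; \mathcal{Y}_k | s_k = 1) = h(\mathcal{Y}_k | s_k = 1) - h(\mathcal{Y}_k | \mathcal{X}_k, s_k = 1). \qquad (69)$$

Since $\tilde{x}_{k,t} = x_{k,t}$ holds for the DD-US, the second term $h(\mathcal{Y}_k | \mathcal{X}_k, s_k = 1)$ is equal to the differential entropy $T h(n_{k,t})$ of the
noise $\{n_{k,t}\}$ in (59).^1 On the other hand, the conditional differential entropy $h(\mathcal{Y}_k | s_k = 1)$ is bounded from above by the sum
of the conditional differential entropies of the individual received signals [54],

$$h(\mathcal{Y}_k | s_k = 1) \le \sum_{t=0}^{T-1} h(y_{k,t} | s_k = 1). \qquad (70)$$

These observations imply that the RHS of (69) is bounded from above by the RHS of (68). ■

We shall explain how to calculate the conditional mutual information $I(x_{k,t}; y_{k,t} | s_k = 1)$, which is given by

$$I(x_{k,t}; y_{k,t} | s_k = 1) = \mathbb{E}\left[\log\frac{p(y_{k,t} | x_{k,t}, s_k = 1)}{p(y_{k,t} | s_k = 1)} \,\Big|\, s_k = 1\right], \qquad (71)$$

with

$$p(y_{k,t} | s_k = 1) = \mathbb{E}_{x_{k,t}}\left[p(y_{k,t} | x_{k,t}, s_k = 1) \,|\, s_k = 1\right], \qquad (72)$$

where the conditional pdf $p(y_{k,t} | x_{k,t}, s_k = 1)$ is given by

$$p(y_{k,t} | x_{k,t}, s_k = 1) = \mathbb{E}_{\tilde{x}_{k,t}}\left[p(y_{k,t} | \tilde{x}_{k,t}, s_k = 1) \,|\, x_{k,t}, s_k = 1\right]. \qquad (73)$$

In (73), the pdf $p(y_{k,t} | \tilde{x}_{k,t}, s_k = 1)$ characterizes the equivalent channel

$$p(y_{k,t} | \tilde{x}_{k,t}, s_k = 1) = p_{\mathrm{CG}}\left(y_{k,t} - \sqrt{\frac{P}{q}}\,\tilde{x}_{k,t}; N_0\right), \qquad (74)$$

with (2), where $q$ is equal to $q_0$ or $q_1$ in Propositions 1 and 2.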




^1 This statement does not hold for the US-CVP. Consequently, we need to derive a tight lower bound on the second term to obtain an upper bound on (57).
Unfortunately, we are unable to find such a tight lower bound that can be calculated easily.

Fig. 1. $\mathcal{E}/K$ versus $\alpha\kappa = K/N$ for $\alpha = 4$, $T = 64$, and Gaussian signaling.

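In order to calculate the expectations in (72) and (73), we need the joint posterior probability of $\tilde{x}_{k,t}$ and $x_{k,t}$ given $s_k = 1$,
given by

$$\Pr(\tilde{x}_{k,t} \in \tilde{\mathcal{A}}, x_{k,t} \in \mathcal{A} \,|\, s_k = 1) = \frac{\Pr(s_k = 1, \tilde{x}_{k,t} \in \tilde{\mathcal{A}} \,|\, x_{k,t})\Pr(x_{k,t} \in \mathcal{A})}{\Pr(s_k = 1)}, \qquad (75)$$

with measurable sets $\tilde{\mathcal{A}} \subset \mathbb{C}$ and $\mathcal{A} \subset \mathbb{C}$. In (75), $\Pr(x_{k,t} \in \mathcal{A})$ denotes the prior probability of $x_{k,t}$. Furthermore, the
conditional probability $\Pr(s_k = 1, \tilde{x}_{k,t} \in \tilde{\mathcal{A}} \,|\, x_{k,t})$ is characterized via Proposition 3 or Proposition 4,

$$\Pr(s_k = 1, \tilde{x}_{k,t} \in \tilde{\mathcal{A}} \,|\, x_{k,t}) = \Pr\left(E_k(q) \le \xi_{\kappa,T}(q),\ \tilde{x}_{k,t}^{(\mathrm{opt})}(q) \in \tilde{\mathcal{A}} \,\Big|\, x_{k,t}\right), \qquad (76)$$

with $q = q_0$ and $q = q_1$ under the RS and 1RSB assumptions, respectively. In summary, it is possible to calculate the mutual
information (71) from (72)-(76). See Appendix B for the details.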
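For intuition on how a mutual information of the form (71) is evaluated in the simplest case, the sketch below estimates it by Monte Carlo for QPSK over the channel $y = \sqrt{P/q}\,x + n$ with $n \sim \mathcal{CN}(0, N_0)$. It assumes that the selection indicator is independent of the data symbol (the text states this for the DD-US with QPSK), so that conditioning on $s_k = 1$ leaves the QPSK prior uniform. The function name, sample count, and seed are illustrative choices, not part of the paper.

```python
import math
import random

# Unit-power QPSK constellation.
QPSK = [complex(a, b) / math.sqrt(2.0) for a in (1, -1) for b in (1, -1)]


def qpsk_mutual_information(p_over_q, n0, num_samples=20000, seed=0):
    """Monte Carlo estimate (in bits) of I(x; y) for y = sqrt(P/q) x + n.

    Assumes a uniform QPSK prior given s_k = 1 and n ~ CN(0, n0).
    """
    rng = random.Random(seed)
    gain = math.sqrt(p_over_q)
    sigma = math.sqrt(n0 / 2.0)  # standard deviation per real dimension
    total = 0.0
    for _ in range(num_samples):
        x = rng.choice(QPSK)
        noise = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        y = gain * x + noise
        # The Gaussian normalising constants cancel in the likelihood ratio.
        numerator = math.exp(-abs(y - gain * x) ** 2 / n0)
        denominator = sum(math.exp(-abs(y - gain * c) ** 2 / n0) for c in QPSK) / 4.0
        total += math.log2(numerator / denominator)
    return total / num_samples


if __name__ == "__main__":
    for snr in (0.01, 1.0, 100.0):
        print(snr, qpsk_mutual_information(snr, 1.0))
```

For general modulation the prior of the data symbol given $s_k = 1$ is no longer the original prior, and (75) and (76) are needed to obtain it.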



IV. Numerical Results 

A. Energy Penalty 

US-VP is compared to ZFBF-RUS and CVP-RUS in terms of the average energy penalty. As noted in Section III-B, ZFBF-RUS
and CVP-RUS provide lower bounds on the energy penalties of conventional US with ZFBF and separate US-CVP,
respectively. The block length $T$ is kept finite, while the coherence time $T_c$ is implicitly assumed to tend to infinity.

We first focus on the DD-US with Gaussian signaling $x_{k,t} \sim \mathcal{CN}(0, 1)$. Figure 1 shows the average energy penalty per
selected user $\mathcal{E}/K$ based on Propositions 1 and 2. For comparison, the energy penalties of ZFBF-RUS and CVP-RUS are also
shown on the basis of Corollary 1. The energy penalty of CVP-RUS was originally evaluated in [27]. Furthermore, we plot the
energy penalty of a greedy algorithm for the DD-US with Gaussian signaling proposed in [31]. We obtain three observations:
First, the RS solution is indistinguishable from the 1RSB solution for low-to-moderate $\alpha\kappa = K/N$, whereas there is a gap
between the two solutions for large $\alpha\kappa$. Secondly, the energy penalty of the greedy algorithm is close to the RS and 1RSB
solutions for low-to-moderate $\alpha\kappa$. This observation implies that the two solutions can provide acceptable approximations for
the actual energy penalty in the same region. Finally, the DD-US outperforms ZFBF-RUS and CVP-RUS for low-to-moderate
$\alpha\kappa$. Note that the energy penalty of CVP-RUS for finite-sized systems approaches the asymptotic one from above [30],
whereas the energy penalty of the DD-US approaches the asymptotic one from below, as shown in Fig. 1. This implies that
the performance gap between the DD-US and CVP-RUS should be larger for finite-sized systems.

We next focus on the average energy penalty of the DD-US with QPSK, shown in Fig. 2. For comparison, the energy penalty
of the US-CVP is also shown on the basis of Propositions 1 and 2. Furthermore, we plot the energy penalty of the greedy
algorithm for the DD-US with QPSK [31]. Three observations are obtained: First, the RS and 1RSB solutions for the DD-US
are indistinguishable from the respective solutions for the US-CVP in the low-to-moderate regime of $\alpha\kappa$. Secondly, the RS
and 1RSB solutions for the US-CVP are close to each other for moderate-to-large $\alpha\kappa$, whereas there is a gap between the two
solutions for small $\alpha\kappa$. Finally, the 1RSB solution provides an acceptable approximation for moderate-to-large $\alpha\kappa$, while the
RS solution does for small $\alpha\kappa$.

Fig. 2. $\mathcal{E}/K$ versus $\alpha\kappa = K/N$ for $\alpha = 4$, $T = 64$, and QPSK.




Fig. 3. $\mathcal{E}/K$ versus $T$ for $\alpha = 4$, $\alpha\kappa = 0.5$, and Gaussian signaling.



Finally, we investigate the impacts of $T$ and $\alpha$ on the average energy penalty. Figure 3 shows the energy penalty of the
DD-US with Gaussian signaling versus $T$. For small $T$, the energy penalty of the DD-US increases quickly as $T$ grows. For
moderate-to-large $T$, on the other hand, it increases slowly toward that for ZFBF-RUS. Figure 4 shows the energy penalty of
the DD-US with Gaussian signaling versus $\alpha$. The RS solution is indistinguishable from the 1RSB solution, except for large
$\alpha$. We find that the gap between the analytical predictions and the energy penalty of the greedy algorithm [31] becomes large
as $\alpha$ increases. This may be due to the suboptimality of the greedy algorithm [31].



B. Sum Rate 

The DD-US is compared to ZFBF-RUS, CVP-RUS, and the DPC without power allocation in terms of the achievable sum
rate. For the DD-US, we use the upper bound (67) on the achievable sum rate. The achievable sum rate of
CVP-RUS was evaluated in [30]. The achievable sum rate of the DPC without power allocation is equal to the sum capacity
of a dual MIMO uplink [14] with no power allocation. The sum capacity of the dual MIMO uplink can be calculated in
the large-system limit Q. 

Before presenting the achievable rates, we investigate the distribution of the power of the modified symbol $\tilde{x}_{k,t}$ given $s_k = 1$
for the DD-US with Gaussian signaling, which can be calculated via (75). Figure 5 shows the pdf of $|\tilde{x}_{k,t}|^2$ given $s_k = 1$.
For comparison, we also plot the prior pdf of the original data symbol $x_{k,t}$. The DD-US selects the data symbols with smaller
power to reduce the energy penalty. Consequently, the pdf of the power of the modified symbol $\tilde{x}_{k,t}$ has a lighter tail than the
prior pdf. This non-Gaussianity of the modified symbol results in a rate loss.

Figure 6 shows the upper bound (67) on the achievable sum rate per transmit antenna of the DD-US. There is an optimal $\alpha\kappa$,
or equivalently an optimal number of selected users, that maximizes the sum rate for all schemes. This can be understood as
follows: Increasing the number of selected users results in a degradation of the energy penalty and in an increase of the spatial
multiplexing gain. The latter effect is dominant for small $\alpha\kappa$, whereas the former is dominant for large $\alpha\kappa$. Consequently, the sum rates
are maximized at an optimal number of selected users.

Fig. 5. The pdf of $|\tilde{x}_{k,t}|^2$ given $s_k = 1$ for $\alpha = 4$, $T = 64$, and Gaussian signaling.

Figure 7 shows the upper bound (67) with the optimal number of selected users. The RS and 1RSB solutions for the DD-US
with Gaussian signaling are close to each other. Furthermore, the upper bounds for the DD-US with Gaussian signaling are
larger than the achievable sum rate of CVP-RUS [30] for all transmit SNRs, while the DD-US with QPSK is comparable to
CVP-RUS. Unfortunately, the upper bounds for the DD-US with Gaussian signaling are far from the achievable sum rate of DPC.
For sum rates per transmit antenna of 0.5 bps/Hz and 1 bps/Hz, the DD-US with Gaussian signaling provides performance
gains of 1.2 dB and 1.4 dB, respectively, compared to CVP-RUS. Note that the SNR loss required for detecting whether each
user has been selected is ignored for the upper bound (67). The upper bound becomes tight as $T$ grows. For example, the SNR
loss for an iterative detection algorithm proposed in [31] is 0.5 dB for a sum rate per transmit antenna of 0.5 bps/Hz when
$T = 16$ and QPSK are used. Furthermore, the SNR loss is 0.2 dB for 1 bps/Hz. These results may imply that the DD-US with
Gaussian signaling outperforms CVP-RUS in terms of the actual achievable sum rate.



V. Conclusions 

Joint US-VP has been compared to separate US-VP in the large-system limit, where the numbers of transmit antennas,
users, and selected users tend to infinity while their ratios are kept constant. The analyses under the RS and 1RSB assumptions
have shown that conventional (data-independent) US may make no sense in the large-system limit: Under the RS and 1RSB
assumptions, RUS achieves the same performance as optimal data-independent US in the large-system limit. Since conventional
US is capacity-achieving as only the number of users tends to infinity, this implies that whether conventional US works well
depends on how the asymptotic limits are taken. Joint US-VP can provide a substantial reduction of the energy penalty in the
large-system limit. Consequently, joint US-VP outperforms separate US-VP in terms of the achievable sum rate. In particular,
DD-US can be applied to general modulation, and implemented easily with a greedy algorithm.

Fig. 7. Optimized upper bound versus $P/N_0$ for $\alpha = 4$ and $T = 64$.



Appendix A
Calculation of (35)

A. Fourier Representation 

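The cdf (35) can be calculated via the characteristic function of (32). Let $G_T(\omega)$ denote the characteristic function of (32),

$$G_T(\omega) = \mathbb{E}\left[e^{i\omega E_k(q)}\right]. \qquad (77)$$

Since (32) is the sum of i.i.d. random variables, (77) is decomposed into

$$G_T(\omega) = \left\{G\left(\frac{\omega}{2T}\right)\right\}^{2T}, \qquad (78)$$

with

$$G(\omega) = \mathbb{E}\left[\exp\left\{i\omega\min_{\tilde{x}_{k,t}\in\mathcal{M}_{x_{k,t}}}\left(\sqrt{2}\,\Re[\tilde{x}_{k,t}] - \sqrt{2q}\,\Re[z_{k,t}]\right)^2\right\}\right]. \qquad (79)$$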


In (|79ll. Ma: is given by {x} for the DD-US and ^ for the US-CVP, respectively. 
It is well-known that the pdf of (32) is given by the inverse Fourier transform

$$p(E_k(q) = E) = \frac{1}{2\pi}\int_{-\infty}^{\infty} G_T(\omega)e^{-i\omega E}\,d\omega. \qquad (80)$$




IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. , NO. , 2012 



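Integrating the pdf (80) from $0$ to $x$, we obtain

$$F_T(x; q) = \int_{-\infty}^{\infty}\frac{1 - e^{-i\omega x}}{2\pi i\omega}\,G_T(\omega)\,d\omega. \qquad (81)$$

Since

$$\int_{-\infty}^{\infty}\frac{e^{i\omega E_k(q)}}{2\pi i\omega}\,d\omega = \frac{1}{\pi}\int_{0}^{\infty}\frac{\sin(\omega E_k(q))}{\omega}\,d\omega = \frac{1}{2}, \qquad (82)$$

(81) reduces to

$$F_T(x; q) = \frac{1}{2} - \frac{1}{2\pi i}\int_{-\infty}^{\infty}\left\{G\left(\frac{\omega}{2T}\right)\right\}^{2T}\frac{e^{-i\omega x}}{\omega}\,d\omega, \qquad (83)$$

where we have used (78). It is possible to calculate (83) numerically when the characteristic function (79) is given.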



B. DD-US 

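Let us calculate the characteristic function (79) for the DD-US. Since $\sqrt{2q}\,\Re[z_{k,t}] \sim \mathcal{N}(0, q)$ and $\mathcal{M}_x = \{x\}$ for the DD-US,
we obtain

$$G(\omega) = \mathbb{E}\left[\frac{1}{\sqrt{1 - 2iq\omega}}\exp\left(\frac{2i\omega\,\Re[x_{k,t}]^2}{1 - 2iq\omega}\right)\right]. \qquad (84)$$

For QPSK $\Re[x_{k,t}] = \pm 1/\sqrt{2}$, (84) reduces to

$$G(\omega) = \frac{1}{\sqrt{1 - 2iq\omega}}\exp\left(\frac{i\omega}{1 - 2iq\omega}\right). \qquad (85)$$

For Gaussian signaling $\Re[x_{k,t}] \sim \mathcal{N}(0, 1/2)$, (84) reduces to

$$G(\omega) = \frac{1}{\sqrt{1 - 2i(1 + q)\omega}}, \qquad (86)$$

which is associated with the characteristic function for the chi-square distribution with one degree of freedom. In this case,
the cdf (35) is associated with that for the chi-square distribution with $2T$ degrees of freedom:

$$F_T(x; q) = \gamma\left(T, \frac{Tx}{1 + q}\right), \qquad (87)$$

where $\gamma(a, x)$ is the incomplete gamma function

$$\gamma(a, x) = \frac{1}{\Gamma(a)}\int_0^x t^{a-1}e^{-t}\,dt, \qquad (88)$$

with $\Gamma(x)$ denoting the gamma function.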
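The Fourier-inversion route of Section A and the closed form for Gaussian signaling can be cross-checked numerically. The sketch below assumes that the block characteristic function for Gaussian signaling is $\{G(\omega/(2T))\}^{2T} = (1 - i(1+q)\omega/T)^{-T}$, evaluates the inversion integral in the equivalent one-sided form $F = 1/2 - \pi^{-1}\int_0^\infty \Im[G_T(\omega)e^{-i\omega x}]/\omega\,d\omega$ with a plain midpoint rule, and compares it with the regularized incomplete gamma function for integer $T$. The truncation point, step size, and function names are illustrative choices.

```python
import math


def gaussian_block_cf(omega, q, block_length):
    """G_T(omega) = {G(omega / (2T))}^(2T) with G taken from (86)."""
    return (1.0 - 1j * (1.0 + q) * omega / block_length) ** (-block_length)


def cdf_by_inversion(x, q, block_length, omega_max=200.0, step=0.005):
    """Midpoint-rule evaluation of 1/2 - (1/pi) int_0^inf Im[G_T(w) e^{-iwx}] / w dw."""
    total = 0.0
    for j in range(int(omega_max / step)):
        w = (j + 0.5) * step  # midpoints avoid the removable singularity at w = 0
        value = gaussian_block_cf(w, q, block_length) * complex(math.cos(w * x), -math.sin(w * x))
        total += value.imag / w
    return 0.5 - total * step / math.pi


def cdf_closed_form(x, q, block_length):
    """Regularized incomplete gamma function gamma(T, T x / (1 + q)) for integer T."""
    y = block_length * x / (1.0 + q)
    term, partial_sum = 1.0, 1.0
    for j in range(1, block_length):
        term *= y / j
        partial_sum += term
    return 1.0 - math.exp(-y) * partial_sum


if __name__ == "__main__":
    print(cdf_by_inversion(2.0, 1.0, 4), cdf_closed_form(2.0, 1.0, 4))
```

The same inversion routine applies to QPSK once the block characteristic function is swapped for the one built from (85), for which no elementary closed-form cdf is available.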



C. US-CVP 

Let us calculate the characteristic function (l79l l for the US-CVP. From (fTTI i. we obtain 



G{uj) = E 



cxp ( icj min (i — z)^ 

a;G[l,oo) 



where the expectations are taken with respect to z ^ JV{0,q). Calculating the expectation yields 
In ( |90] l. Du denotes the standard Gaussian measure (|4|l. Furthermore, Q{x) is given by Q. 



(89) 



(90)

Appendix B
Sum Rate for DD-US

A. Calculation of (76)

The conditional probability ( |76] | for the DD-US can be calculated in the same manner as in Appendix lAl Let Gt(w, {wt}; Xk) 
denote the conditional characteristic function of i 



Xk 



where xl^t^\q) is given by ( [33] l. From ( [32] l. the characteristic function ( |9TT l is decomposed into 



t=0 



where G{uj,ujt;xk,t) is given by 



cxp <{ 2iw - y/qzk,t 



Xk,t 



Then, the conditional probability ( f76] l is given by 



with 



p{sk = l,Xk\Xk) 



p(sfe = 1, A'fe|A'fe)(iA'fe 



Xk,t 



27ria; 



n/( 



t=0 



2T 



S:k,t;xk,t] duj 



In ( |95] l. f{uJ,Xk,t',Xk,t) is defined as 



f{iO,Xk,uXk,t) = 



(27r)2 



with (|93]l. 

Let us calculate 



with 



for the DD-US to calculate (|94] |. In the same manner as in the derivation of 



G{LO]Xk^t) 



1 — 2iq(jj 



exp 



2\i^\xk.t\'^ 
1 — 2i(7w 



Substituting (|97]l into (|96ll yields 
which implies that (|95] ) reduces to 



Xk,uXk,t) = G{uj] Xk,t)S{xk,t - Xk.t), 



T-l 



p{sk = l,Xk\Xk) = Y\_ ^{^k,t - Xk,t) 

t=o 



\2T 



Xk,t 



'i"?K,T(«) 



-duj 



we obtain 



(91) 



(92) 



(93) 



(94) 



(95) 
(96) 

(97) 
(98) 
(99) 



(100) 



with ( |98] l. The expressions (|98] l and ( llOOl i imply that Sk is independent of Xk for the DD-US when QPSK jxfc = 1 is used. 
Substituting (llOOl i into dgUi, we find that (HHi for the DD-US is given by 



'1 1 



--i{xk,t e ^) 



2 27ri 



G (^]Xkf 



\2T' 



2(T-1) Q-i^i^.T{q) 



where G(w) and G{uj;xk.t) are given by ( l84l i and ( |98] l. respectively. 



(101)

B. Calculation of (71)

The conditional pdf (73) for the DD-US reduces to

$$p(y_{k,t} | x_{k,t}, s_k = 1) = p(y_{k,t} | \tilde{x}_{k,t} = x_{k,t}, s_k = 1), \qquad (102)$$



with ( |74] |. We shall evaluate the conditional pdf ( f72] i for Gaussian signaling Xk^t ^ C7V^(0, 1). Substituting (fTsT l with ( llOll i into 
(|72] l and then calculating the integration with respect to Xk,t, we obtain 



PiUkA^k = 1) 



1 



:PCG yk,t 



p 



p 
q 



Nr 



■PCG [ Vk£,—(T^ (^) +^0 



1 

27ri 

e" 



'OO 

G 

— OO 



\2TJ 



2T 



-duj 



OJ 



with 



1 — iquj 



(103) 



(104) 



In (11031 ). pcg{z', c^) and G{uj) are given by (|2|l and 
with ( fT02] i and (fT03T l. 



1 - i(l + 

respectively. It is possible to calculate the mutual information (fTTT l 



Appendix C 
Derivation of Proposition 1

A. Statistical Physics

Before deriving Proposition 1, we shall present a brief introduction to statistical physics. Statistical physics elucidates
macroscopic properties of many-body systems that consist of many interacting microscopic elements. Let $s_i$ denote a
variable that represents the state of the $i$th microscopic element for $i = 1, \ldots, N$. Suppose that the interactions between
the microscopic elements are characterized by the Hamiltonian $H(s)$, which is a real-valued function of the configuration
$s = (s_1, \ldots, s_N)^{\mathrm{T}}$. It is known that the distribution of $s$ is given by the so-called Gibbs-Boltzmann distribution with a positive
parameter $\beta > 0$,

$$\Pr(s; \beta) = Z(\beta)^{-1}e^{-\beta H(s)}, \qquad (105)$$

with

$$Z(\beta) = \sum_{\{s\}} e^{-\beta H(s)}. \qquad (106)$$

The parameter $\beta$ is called "inverse temperature." Let $\mathcal{S}_{\mathrm{g}}$ denote the set of ground states $\{s\}$ that minimize the Hamiltonian
$H(s)$. Only the ground states contribute to the Gibbs-Boltzmann distribution in the low-temperature limit $\beta \to \infty$: The
Gibbs-Boltzmann distribution (105) converges to

$$\Pr(s; \beta) \to \frac{1}{|\mathcal{S}_{\mathrm{g}}|}1(s \in \mathcal{S}_{\mathrm{g}}), \qquad (107)$$

in the low-temperature limit $\beta \to \infty$ [60].

The normalization constant (106) is called "partition function," and is utilized to calculate several macroscopic quantities.
As an example, let us calculate the internal energy $\langle H(s)\rangle_\beta$, with $\langle\cdots\rangle_\beta$ denoting the expectation with respect to the
Gibbs-Boltzmann distribution (105). We define the free energy as

$$f(\beta) = -\frac{1}{\beta}\ln Z(\beta), \qquad (108)$$

with (106). It is straightforward to find that the internal energy is given by

$$\langle H(s)\rangle_\beta = \frac{\partial}{\partial\beta}\left[\beta f(\beta)\right]. \qquad (109)$$

This implies that calculating the internal energy reduces to calculating the free energy.
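The relations above are easy to verify on a toy system. The sketch below uses a hypothetical three-spin Hamiltonian (chosen only for illustration, not taken from the paper), computes the free energy (108) with a numerically stable log-sum-exp, checks the identity (109) by a central finite difference, and shows that the free energy approaches the minimum of the Hamiltonian at large $\beta$, in line with the low-temperature limit (107).

```python
import itertools
import math


def toy_hamiltonian(s):
    """Hypothetical three-spin energy function with a unique ground state."""
    return -s[0] * s[1] - s[1] * s[2] + 0.3 * s[0]


ENERGIES = [toy_hamiltonian(s) for s in itertools.product((-1, 1), repeat=3)]


def log_partition(energies, beta):
    """ln Z(beta), cf. (106), computed with a shift by the minimum energy."""
    e_min = min(energies)
    return -beta * e_min + math.log(sum(math.exp(-beta * (e - e_min)) for e in energies))


def free_energy(energies, beta):
    """f(beta) = -(1/beta) ln Z(beta), cf. (108)."""
    return -log_partition(energies, beta) / beta


def internal_energy(energies, beta):
    """Expectation of H under the Gibbs-Boltzmann distribution (105)."""
    e_min = min(energies)
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


def internal_energy_from_free_energy(energies, beta, h=1e-5):
    """Central finite difference of beta * f(beta), cf. (109)."""
    upper = (beta + h) * free_energy(energies, beta + h)
    lower = (beta - h) * free_energy(energies, beta - h)
    return (upper - lower) / (2.0 * h)


if __name__ == "__main__":
    print(internal_energy(ENERGIES, 1.0), internal_energy_from_free_energy(ENERGIES, 1.0))
    print(free_energy(ENERGIES, 200.0), min(ENERGIES))
```

With several degenerate ground states the free energy would approach the minimum energy with a correction of order $\ln|\mathcal{S}_{\mathrm{g}}|/\beta$, which still vanishes as $\beta \to \infty$.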

Since the Gibbs-Boltzmann distribution (105) converges to (107) in the low-temperature limit, the internal energy tends to
the ground state energy, which is the minimum of the Hamiltonian $H(s)$, in the low-temperature limit. The ground state energy
can be calculated from (108) directly:

$$\langle H(s)\rangle_\infty = \lim_{\beta\to\infty} f(\beta). \qquad (110)$$

We will use the formula (110) to calculate the energy penalty.

B. Formulation

The average energy penalty $\mathcal{E} = \mathbb{E}[\mathcal{E}(\{H_{\mathcal{K}_i}\}, \{x_{\mathcal{K}_i,t}\})]$ for US-VP (18), given via (21), is equal to the average of the
energy penalty $\mathcal{E}_i$ (19) for any block $i$. Without loss of generality, we focus on the first block $i = 0$ and drop the subscripts $i$
from $\mathcal{E}_i$, $\mathcal{K}_i$, and $s_{k,i}$.

The asymptotic energy penalty for VP was analyzed with the R-transform for the empirical eigenvalue distribution of
$(HH^{\mathrm{H}})^{-1}$ [27], [30]. Unfortunately, it is difficult to apply this method to our case, since the empirical eigenvalue distribution
of $(H_{\mathcal{K}}H_{\mathcal{K}}^{\mathrm{H}})^{-1}$ is hard to calculate. Instead, we use the following lemma to calculate the average energy penalty $\mathcal{E}$ without
using the R-transform.

Lemma 4. Let us define S — diagjsi, . . . , sk} and Xt = (ii.t , . . . , XK,t)'^ & Y\^=i -^xk t> with Sk given by l\23i . The energy 
penalty ( 1791 ) /or the first block i — is equal to 

fmin = lim minminmin;^HA(S', (111) 

A-s-oo {sfc} {xt} {ut} J 

where the minimizations with respect to {sk}, {xt}, and {itt} are over {0, 1}^, Jl^Io^ Oa^i t> and C^'^, respectively. 
In (inZll, nx{S, {At), {ut}), is given by 

T-l 

nx{S,{xt},{ut}) = \\ut\\' + \g{S,{xt},{ut}), (112) 

t=0 

with 

T-l 2 

<?(5,{it},{Mt}) = ^||5(lfMt-it)ll'+(TrS-x) . (113) 

i=0 

Proof of Lemma ^ Since the function ( II 131 ) is non-negative, the function ( II 12b is bounded in A ^ cx) only when 
g{S, {xt}, {ut}) = 0. This implies 

K 

Y^Sk^k, (114) 

fe=i 

wt = Jfg fiJ^cJf") \K,t^uf^\HK,iK,t), (115) 



where we have used JC — {k <^ /Call '■ Sfc — 1}^ obtained from (|23T l. Thus, (II 1 11 1 reduces to 

1 ZF 
/CC/Caii:|/C|=K{SK,t} J ^ II 

which is equal to the energy penalty (fT9] l with US-VP ( fTsT l. 
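The mechanism behind Lemma 4 is a penalty method: a hard zero-forcing constraint is replaced by a quadratic penalty whose weight $\lambda$ is sent to infinity. The sketch below illustrates this on a deliberately simplified toy problem (one real-valued constraint $h^{\mathrm{T}}u = x$, no user selection), which is an assumption made for illustration and not the setting of the lemma. For this toy problem the penalized minimum is available in closed form, $\lambda x^2/(1 + \lambda\|h\|^2)$, and tends to the zero-forcing energy $x^2/\|h\|^2$.

```python
def penalized_minimum(h, x, lam):
    """Minimize ||u||^2 + lam * (h.u - x)^2 over real vectors u.

    Setting the gradient to zero gives u = c * h with c = lam * x / (1 + lam * ||h||^2).
    Returns the minimizer and the attained objective value.
    """
    h_norm_sq = sum(v * v for v in h)
    c = lam * x / (1.0 + lam * h_norm_sq)
    u = [c * v for v in h]
    residual = sum(a * b for a, b in zip(h, u)) - x
    value = sum(v * v for v in u) + lam * residual ** 2
    return u, value


def zero_forcing_energy(h, x):
    """Minimum of ||u||^2 subject to the hard constraint h.u = x."""
    return x * x / sum(v * v for v in h)


if __name__ == "__main__":
    h, x = [1.0, 2.0, 2.0], 3.0
    for lam in (1.0, 100.0, 1e9):
        print(lam, penalized_minimum(h, x, lam)[1], zero_forcing_energy(h, x))
```

The penalized value stays bounded as $\lambda$ grows only because the minimizer drives the constraint violation to zero, which mirrors the argument used in the proof.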
We start with defining the free energy as 



(116) 



I3KT 

where the so-called partition function A) is given by 



e 

NT 



Jl rfi^ duu (118) 
t=o t=o 

with ( III2I 1. Only the minimums of ( I112l l contribute to the free energy (II 17l l in /3 — > 00, so that taking the limit /3 — > cxo in 
(1117b before A — > 00 yields 

lim lim iE[f„ii„], (119) 

A— >C30 /3— >oo J\ 

which is equal to the average energy penalty per selected user £ /K for US-VP ( fTsT i from Lemma |4] Thus, calculating the 
average energy penalty is equivalent to evaluating the free energy ( II 17b . 



We use the replica method to calculate the free energy (117) in the large-system limit. The replica method is based on the
identity

$$f = -\lim_{u\to 0}\frac{1}{\beta uKT}\ln\mathbb{E}\left[Z(\beta, \lambda)^u\right]. \qquad (120)$$

Since the RHS is difficult to calculate for real $u > 0$, we regard $u$ as a natural number to obtain a special expression for (120)
with (118),

$$f = -\lim_{u\to 0}\frac{1}{\beta uKT}\ln Z_u(\beta, \lambda), \qquad (121)$$

with 



^n(/3,A) =E 



T-l 



T-l 



-/3«;,(S„,{X4,4,{U4,„}) 



Jl {dXt.adUt.a) 



t=0 



(122) 



In (122), $u_{t,a} \in \mathbb{C}^N$, $\tilde{x}_{t,a} \in \prod_{k=1}^{K}\mathcal{M}_{x_{k,t}}$, and $s_{k,a} \in \{0, 1\}$ denote replicas of the transmit vector $u_t$, the modified
data symbol vector $\tilde{x}_t$, and the indicator variable $s_k$, respectively. Furthermore, the diagonal matrix $S_a$ is given by
$S_a = \mathrm{diag}\{s_{1,a}, \ldots, s_{K,a}\}$.
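The identity behind the replica method, $\mathbb{E}[\ln Z] = \lim_{u\to 0} u^{-1}\ln\mathbb{E}[Z^u]$, can be seen on a toy random variable. The sketch below uses a hypothetical three-valued $Z$ (chosen only for illustration) and compares the quenched average $\mathbb{E}[\ln Z]$ with $u^{-1}\ln\mathbb{E}[Z^u]$ for integer and for small real $u$.

```python
import math

# Hypothetical discrete "partition function": values and their probabilities.
Z_VALUES = [0.5, 2.0, 10.0]
Z_PROBS = [0.3, 0.5, 0.2]


def quenched_average(values, probs):
    """E[ln Z], the quantity the free energy is built from."""
    return sum(p * math.log(z) for z, p in zip(values, probs))


def replica_exponent(values, probs, u):
    """(1/u) ln E[Z^u]; its u -> 0 limit equals E[ln Z]."""
    return math.log(sum(p * z ** u for z, p in zip(values, probs))) / u


if __name__ == "__main__":
    print("E[ln Z] =", quenched_average(Z_VALUES, Z_PROBS))
    for u in (3, 2, 1, 0.1, 1e-3, 1e-6):
        print(u, replica_exponent(Z_VALUES, Z_PROBS, u))
```

In the replica method the moments are only computable for natural numbers $u$, and the limit $u \to 0$ relies on an analytic continuation that is assumed rather than proved; the toy example merely shows why the limit yields the quenched average when the moments are known for all real $u$.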



C. Average over Quenched Randomness 

We first evaluate the expectation in (I122l l with respect to the channel matrix H. Using (II 12l l yields 

j-l T-l 



a=0 t=0 



with 



4A({«*,a}) =IE 



ti-1 



T-l 



1=0 |^{sfc,ae{o,i}} t=o vlifc=i^="fc,t. 

T-l 

-/3Ag(S„,{xt,„},{ttt,a}) 



dxt 



where g{{sk,a}, {xt,a}, {ut,a}) is given by (II 13b . Let us define a random vector Va{k) £ C"^ as 



N 



(123) 



(124) 



(125) 



with Ua{n) — (u„ o,a, • ■ • , Un,T-i,a)"'^, in which Un^t.a — {ut,a)n dcuotcs the nth element of Ut.a- Since we have assumed that 
H has independent circularly symmetric complex Gaussian elements with variance v{k) — (t;o(fc)^, . . . , i'u-i(fc)"'")"'" 

conditioned on {ut.a} is a circularly symmetric complex Gaussian random vector with the covariance matrix 



1 ^ 



(126) 



n=l 



with u{n) = (Mo(n)^, . . . , The function dl 13b in (11241) depends on {ut.a} only through the covariance ma- 



trix (1126b . so that we can re-write (1124b as iQ) to find that ( 1123b reduces to 



uNT 



N 

n 

n=l 



J.NT 

lT 



In ([T271), E^^^iQ) is given by 



3-/3||M(n)|| 



K 



du(n) 



(127) 



a=0 l^{sfc,„e{0,l}} fc=l \''llt=o ^^k.t 
e-/3Ag(W,„},{*„(fc)},{.„(/c)})-Q^~^(^) 



fe=l 



(128) 
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g{{Sk,a} AXa{k)} ,{Va{k)}) 



K f ^ \ 

=Y,^kA\'"a{k)-Xa{k)f +\y^Sk^a~K\ , (129) 
k=l \k=l / 

with Xa{k) = ((*o,a)fc, • ■ • , {XT-I,a)kf- 

D. Average over Spin Variables 

We next calculate the integration in (127) with respect to $\{u(n)\}$. The expression (127) implies that $\{u(n)\}$ can be regarded as
independent circularly symmetric complex Gaussian random vectors with the covariance matrix $\beta^{-1}I_{uT}$. Thus, the covariance
matrix (126) is regarded as a complex Wishart matrix [50] with $N$ degrees of freedom, so that the pdf of (126) is given by

P{Q) = C,e-'^^T'« dct Q^-"^, (130) 

with 

Cu = ^ ' ^ . (131) 

^uT(nT-l)/2-Q^J^(^„ j)! 

Replacing the integration in ( |127b with respect to {tt(n)} by the average over Q after substituting ( |127b into the free 
energy (1121b . we obtain 

/ =- lim — \^ In / Cu det Q""^ 

■ exp |/3iV In (Q) - 4(Q)^ | dQ 

' inf^V (132) 



with 



/3aK \P J ' 

/„(Q) = Trg-ilndetQ. (133) 
Assuming that the large-system limit and the limit $u \to 0$ are commutative, we use the saddle-point method to arrive at

lim / = lim ^$„(gj _ J- ln{ne), (134) 

with 

$„(Q) = UQ) - lim ^ InSg(Q), (135) 
where we have used the asymptotic formula for (1131b 

lim ^^lnC„ = -^ln(/3e)-Ko(l) (136) 

K~^oo i3uKT pan 

in the large-system limit. In ( 1134b . the limit limA'^oo denotes the large-system limit. Furthermore, is the solution to 
minimize (1135b : 

= argmin$„(Q). (137) 
Q 

E. Replica Symmetry Solution 

Let us assume RS for the solution ( |137b . 

Assumption 2 (Replica Symmetry). 



= (x/„+<Zol«ln)®^T- (138) 



We first calculate ( |133b to obtain 



u{x + Qo)- ln(x + uqo)- '^^-^ In X 



(139) 



^X + go-|^-^lnx, (140) 



in u 0. 
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We next evaluate ( |128l l. The RS assumption ( |138l l implies that (|125p is represented as 



(141) 



where {wa{k) e C"^} and {z{k) e C^} are independent circularly symmetric complex Gaussian random vectors with 
covariance It, respectively. We calculate the expectation with respect to {wa{k)} to obtain 



sg(gj = E[(4f)({^(fc)}))' 



(142) 



with 



Hr(W')))- E n(L.. 

{ske{Q.l}}k=l \"'llt=o 



Q-Hf>,''H{sk},{i:ik)}-Mk)}) Ji 



l[diik), 



(143) 



k=l 



where ^'^^'({sfc}, {S(fc)}, {2;(fc)}) is given by 



K 



pXsk 



■ PXskX 



\xik) - y/q;^z{k)\\ 



(144) 



Taking $u \to 0$ yields



lim lim 



ln4A^(Qs) 



1 



-E 



lnS(f)({^(fc)}) 



(145) 



with (143). Since (144) should be $O(\beta)$ in $\beta \to \infty$, $\chi$ must be $O(\beta^{-1})$ in $\beta \to \infty$. Taking $\beta \to \infty$ with $\tilde{\chi} = \beta\chi$ fixed before $\lambda \to \infty$ yields



lim lim lim lim 



1 



A— >oo (3— i-oo M— yO K^oo j3uKT 

=-4 lim E[SR,s(go)], 



lnSg(QJ 



where $E_{\mathrm{RS}}(q_0)$ is given by



with 



1 



K 



ERsilo) ^ ^ min V Sfe£;^^^^ (qo), 



T-l 



(146) 



(147) 



(148) 



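In order to evaluate the expectation of (147), we write the order statistics of $\{E_k^{(\mathrm{RS})}(q_0)\}$ as $\{E_{(k)}^{(\mathrm{RS})}(q_0)\}$, i.e. $E_{(1)}^{(\mathrm{RS})}(q_0) \leq E_{(2)}^{(\mathrm{RS})}(q_0) \leq \cdots \leq E_{(K)}^{(\mathrm{RS})}(q_0)$ [48]. Since (147) can be represented as

$$E_{\mathrm{RS}}(q_0) = \frac{1}{K} \sum_{k=1}^{\tilde{K}} E_{(k)}^{(\mathrm{RS})}(q_0), \tag{149}$$

Lemma 2 implies

$$\lim_{K\to\infty} E[E_{\mathrm{RS}}(q_0)] = \mu_{\kappa,T}(q_0), \tag{150}$$

with (41).

Finally, we substitute (140) and (146) with (150) into the free energy (134) to arrive at

$$\lim_{\lambda\to\infty} \lim_{\beta\to\infty} \lim_{K\to\infty} f = \frac{1}{\alpha\kappa}\left( q_0 - \frac{q_0 - \alpha\mu_{\kappa,T}(q_0)}{\tilde{\chi}} \right), \tag{151}$$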






where $\lim_{\beta\to\infty}$ denotes the limit in which $\beta \to \infty$ and $\chi \to 0$ with $\tilde{\chi} = \beta\chi$ fixed. In (151), $\tilde{\chi}$ and $q_0$ are chosen so as to extremize the free energy (151). The stationarity condition for $\tilde{\chi}$ implies that $q_0$ is the solution to the fixed-point equation

$$q_0 = \alpha\mu_{\kappa,T}(q_0). \tag{152}$$

Substituting (152) into the free energy (151) yields $f = q_0/(\alpha\kappa)$.

If the fixed-point equation (152) has multiple solutions, the solution $q_0$ that minimizes (135), or equivalently the free energy, is selected. Since the free energy is given by $q_0/(\alpha\kappa)$, this criterion is equivalent to selecting the smallest solution to the fixed-point equation (152).
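The selection rule for the fixed-point equation (152) can be illustrated numerically. The sketch below is an assumption-laden example, not the paper's procedure: `mu` stands in for $\mu_{\kappa,T}$ and is passed as an arbitrary callable, the toy function used at the end is purely hypothetical, and the search interval and grid size are arbitrary choices. The code scans a grid for the first sign change of $q - \alpha\mu(q)$ and refines it by bisection, which returns the smallest solution within the interval when the grid is fine enough.

```python
import numpy as np

def smallest_fixed_point(mu, alpha, q_max=50.0, num=20001, tol=1e-12):
    """Return the smallest q in (0, q_max] with q = alpha * mu(q), or None.

    mu is a placeholder for mu_{kappa,T}; any scalar callable works.
    """
    grid = np.linspace(0.0, q_max, num)[1:]          # skip q = 0
    g = np.array([q - alpha * mu(q) for q in grid])  # residual of (152)
    for i in range(len(grid) - 1):
        if g[i] == 0.0:
            return float(grid[i])
        if g[i] * g[i + 1] < 0.0:                    # first sign change
            lo, hi, g_lo = grid[i], grid[i + 1], g[i]
            while hi - lo > tol:                     # bisection refinement
                mid = 0.5 * (lo + hi)
                g_mid = mid - alpha * mu(mid)
                if g_lo * g_mid <= 0.0:
                    hi = mid
                else:
                    lo, g_lo = mid, g_mid
            return float(0.5 * (lo + hi))
    return None

# Hypothetical toy function with three fixed points at q = 1, 2, 3.
mu_toy = lambda q: q - 0.1 * (q - 1.0) * (q - 2.0) * (q - 3.0)
alpha, kappa = 1.0, 0.5                              # illustrative values
q0 = smallest_fixed_point(mu_toy, alpha, q_max=10.0)
free_energy = q0 / (alpha * kappa)                   # f = q0/(alpha*kappa)
print(q0, free_energy)
```

Because the free energy is proportional to $q_0$, picking the smallest root is the same as picking the root with the lowest free energy, which is why a left-to-right scan suffices here.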



Appendix D 
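Derivation of Proposition 2

We start with (134). Let us assume 1RSB for the solution (137).

Assumption 3 (1-step Replica Symmetry Breaking).

$$Q_s = \left[\chi I_u + q_0 1_u 1_u^{\mathrm{T}} + q_1 I_{u/m_1} \otimes (1_{m_1} 1_{m_1}^{\mathrm{T}})\right] \otimes I_T, \tag{153}$$

for a positive integer $m_1$ satisfying $u/m_1 \in \mathbb{N}$.

We first calculate (133) to obtain

$$\frac{1}{uT} I_u(Q_s) = \chi + q_0 + q_1 - \frac{1}{\beta u}\left[\frac{u(m_1 - 1)}{m_1}\ln\chi + \left(\frac{u}{m_1} - 1\right)\ln(\chi + m_1 q_1) + \ln(\chi + u q_0 + m_1 q_1)\right] \tag{154}$$

$$\to \chi + q_0 + q_1 - \frac{q_0}{\beta(\chi + m_1 q_1)} - \frac{1}{\beta m_1}\ln\left(1 + \frac{m_1 q_1}{\chi}\right) - \frac{1}{\beta}\ln\chi, \tag{155}$$

in $u \to 0$.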
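The eigenvalue structure behind (154) can be verified numerically. The sketch below assumes real-valued matrices and arbitrary illustrative parameters; the function names are hypothetical. The $u \times u$ block of (153) has eigenvalues $\chi$ (multiplicity $u - u/m_1$), $\chi + m_1 q_1$ (multiplicity $u/m_1 - 1$), and $\chi + u q_0 + m_1 q_1$ (multiplicity 1), and the Kronecker product with $I_T$ repeats each of them $T$ times.

```python
import numpy as np

def one_rsb_block(u, m1, chi, q0, q1):
    """u x u block of (153): chi*I + q0*11^T + q1*(I_{u/m1} kron 11^T)."""
    assert u % m1 == 0, "u/m1 must be an integer"
    groups = np.kron(np.eye(u // m1), np.ones((m1, m1)))
    return chi * np.eye(u) + q0 * np.ones((u, u)) + q1 * groups

def logdet_closed_form(u, m1, chi, q0, q1):
    """ln det of the block, from the three distinct eigenvalues."""
    return ((u - u // m1) * np.log(chi)
            + (u // m1 - 1) * np.log(chi + m1 * q1)
            + np.log(chi + u * q0 + m1 * q1))

def i_u_per_replica(u, m1, chi, q0, q1, beta):
    """Right-hand side of (154), continued analytically to real u."""
    bracket = (u * (m1 - 1) / m1 * np.log(chi)
               + (u / m1 - 1.0) * np.log(chi + m1 * q1)
               + np.log(chi + u * q0 + m1 * q1))
    return chi + q0 + q1 - bracket / (beta * u)

def limit_155(m1, chi, q0, q1, beta):
    """Right-hand side of (155), the u -> 0 limit."""
    return (chi + q0 + q1 - q0 / (beta * (chi + m1 * q1))
            - np.log(1.0 + m1 * q1 / chi) / (beta * m1) - np.log(chi) / beta)

# Illustrative values only (assumptions, not taken from the paper).
u, m1, T, chi, q0, q1, beta = 6, 2, 2, 0.5, 0.3, 0.8, 2.0
block = one_rsb_block(u, m1, chi, q0, q1)
Q = np.kron(block, np.eye(T))
print(np.linalg.slogdet(Q)[1], T * logdet_closed_form(u, m1, chi, q0, q1))
print(i_u_per_replica(1e-6, m1, chi, q0, q1, beta),
      limit_155(m1, chi, q0, q1, beta))
```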

We next evaluate (128). The 1RSB assumption (153) implies that (125) is represented as

where $\{w_a(k) \in \mathbb{C}^T\}$, $\{z(k) \in \mathbb{C}^T\}$, and $\{z_c(k) \in \mathbb{C}^T\}$ are independent circularly symmetric complex Gaussian random vectors with covariance $I_T$. We calculate the expectation with respect to $\{w_a(k)\}$ to obtain



(156) 



4a (Qs) 



=E 



E 



{.oW}{s™({^(fc)},{^oW})™^} 



i/mi 



(157) 



with 



e 



>{{s^},{S:(k)},{^(k)},{^o(k}}} 



K 



{1 + p\xY^u=i^>^ 

where Hf^^^\{sk}, {x{k)}, {z{k)}, {zQ{k)}) is given by 



\{d&{k), 



(158) 



fe=i 



Hf^^''\{sk},{m},{^m,{^^{k)}) 

K 



k=l 



(3Xsk 



+ /3AsfeX 



\x{k) ~ y/qoz{k) - ^Zoik)\\ 



(159) 
Taking $u \to 0$ yields



lim lim 



•E 



InE 



(160) 



with (158). The function (160) converges in the limit $\beta \to \infty$, $m_1 \to 0$, and $\chi \to 0$ with $\mu_1 = \beta m_1$ and $\tilde{\chi} = \beta\chi$ fixed. Taking this limit before $\lambda \to \infty$ yields



J_lnS^"VQJ = lim 



lim lim lim lim 



1 



•E 



where $E_{\mathrm{1RSB}}(q_0, q_1)$ is given by



lnE{2„(fe)} |exp 

£^iRSB(go,gi) 
1 



X 



£^irsb(9o, qi) 



(161) 



A' 



with 



min ^ Vsfc-Bi^^^^H'Zo,'7i), 



1 ^"^ 



(162) 



(163) 



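In order to evaluate the distribution of (162), we write the order statistics of $\{E_k^{(\mathrm{1RSB})}(q_0, q_1)\}$ as $\{E_{(k)}^{(\mathrm{1RSB})}(q_0, q_1)\}$, i.e. $E_{(1)}^{(\mathrm{1RSB})}(q_0, q_1) \leq E_{(2)}^{(\mathrm{1RSB})}(q_0, q_1) \leq \cdots \leq E_{(K)}^{(\mathrm{1RSB})}(q_0, q_1)$ [48]. Expression (162) can be represented as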



fe=i 



(164) 



Since $E_{\mathrm{1RSB}}(q_0, q_1)$ conditioned on $\{x(k)\}$ and $\{z(k)\}$ converges in law to a Gaussian random variable in the large-system limit [53, Theorem 6], (161) reduces to

lim lim lim lim \^^ \nE^2'}{QJ 



— lim 

AT-i-oo 



[£'iRSB(go, <7i)] /iiTii:V[i;iRSB('Zo, gi)] 



Lemma 2 implies



lim E[£'iRSB(go,'7i)] = AiK,T(go + gi), 

AT— >oo 

lim iirV[£;iRSB(go,'7i)] = crl rilo + n) 

A— >-oo ' 



(165) 

(166) 
(167) 



with (41) and (42).

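Finally, we substitute (155) and (165) with (166) and (167) into the free energy (134) to arrive at

$$\lim_{\lambda\to\infty} \lim_{\beta\to\infty} \lim_{K\to\infty} f = \frac{1}{\alpha\kappa}\left\{ q_0 + q_1 - \frac{1}{\mu_1}\left[ \frac{\mu_1 q_0}{\tilde{\chi} + \mu_1 q_1} + \ln\frac{\tilde{\chi} + \mu_1 q_1}{\tilde{\chi}} \right] + \frac{\alpha}{\mu_1}\left[ \frac{\mu_1 \mu_{\kappa,T}(q_0 + q_1)}{\tilde{\chi}} - \frac{T \mu_1^2 \sigma_{\kappa,T}^2(q_0 + q_1)}{2\tilde{\chi}^2} \right] \right\}, \tag{168}$$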



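where $\lim_{\beta\to\infty}$ denotes the limit $\beta \to \infty$, $m_1 \to 0$, and $\chi \to 0$ with $\mu_1 = \beta m_1$ and $\tilde{\chi} = \beta\chi$ fixed. In (168), $\mu_1$, $\tilde{\chi}$, $q_0$, and $q_1$, or equivalently $\mu_1$, $\bar{\chi} = \tilde{\chi}/\mu_1$, $q_0$, and $q_1$, are chosen so as to extremize the free energy (168). The stationarity conditions for $\mu_1$ and $\bar{\chi}$ are given by

$$\frac{q_0}{\bar{\chi} + q_1} + \ln\left(1 + \frac{q_1}{\bar{\chi}}\right) = \alpha\left[ \frac{\mu_{\kappa,T}(q_0 + q_1)}{\bar{\chi}} - \frac{T\sigma_{\kappa,T}^2(q_0 + q_1)}{2\bar{\chi}^2} \right], \tag{169}$$

$$\frac{q_0}{(\bar{\chi} + q_1)^2} + \frac{q_1}{\bar{\chi}(\bar{\chi} + q_1)} = -\alpha\left[ -\frac{1}{\bar{\chi}^2}\mu_{\kappa,T}(q_0 + q_1) + \frac{T}{\bar{\chi}^3}\sigma_{\kappa,T}^2(q_0 + q_1) \right], \tag{170}$$

respectively. From the stationarity conditions for $q_0$ and $q_1$, we obtain

$$\frac{q_0}{(\bar{\chi} + q_1)^2} = 0, \tag{171}$$

which implies $q_0 \to 0$, $\bar{\chi} \to \infty$, or $q_1 \to \infty$. The free energy (168) diverges in $q_1 \to \infty$. Furthermore, the limit $\bar{\chi} \to \infty$ corresponds to the RS solution. Taking $q_0 \to 0$ yields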



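$$\lim_{\lambda\to\infty} \lim_{\beta\to\infty} \lim_{K\to\infty} f = \frac{q_1}{\alpha\kappa}, \tag{172}$$

where $q_1$ satisfies the coupled fixed-point equations,

$$\ln\left(1 + \frac{q_1}{\bar{\chi}}\right) = \alpha\left[ \frac{1}{\bar{\chi}}\mu_{\kappa,T}(q_1) - \frac{T}{2\bar{\chi}^2}\sigma_{\kappa,T}^2(q_1) \right], \tag{173}$$

$$\frac{q_1}{\bar{\chi} + q_1} = \alpha\left[ \frac{1}{\bar{\chi}}\mu_{\kappa,T}(q_1) - \frac{T}{\bar{\chi}^2}\sigma_{\kappa,T}^2(q_1) \right]. \tag{174}$$

If the coupled fixed-point equations (173) and (174) have multiple solutions, the solution $q_1$ that minimizes the free energy, or equivalently the smallest solution $q_1$ to the coupled fixed-point equations, is selected.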
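The stationarity conditions used in this appendix can be traced in a few lines. The following worked step is a sketch under the assumption that the free energy (168) is rewritten with the rescaled variable $\tilde{\chi}/\mu_1$, denoted here by $\bar{\chi}$; the shorthand $G$ and the abbreviations $\mu \equiv \mu_{\kappa,T}(q_0+q_1)$ and $\sigma^2 \equiv \sigma_{\kappa,T}^2(q_0+q_1)$ are introduced only for this illustration.

```latex
\begin{align*}
\alpha\kappa f &= q_0 + q_1 - \frac{1}{\mu_1} G(\bar{\chi}, q_0, q_1), \\
G(\bar{\chi}, q_0, q_1) &= \frac{q_0}{\bar{\chi}+q_1}
  + \ln\Bigl(1+\frac{q_1}{\bar{\chi}}\Bigr)
  - \alpha\Bigl[\frac{\mu}{\bar{\chi}} - \frac{T\sigma^2}{2\bar{\chi}^2}\Bigr]. \\
\intertext{Since $\mu_1$ enters only through the prefactor $1/\mu_1$,}
\frac{\partial f}{\partial \mu_1} = 0
  &\;\Longleftrightarrow\; G = 0, \\
\frac{\partial f}{\partial \bar{\chi}} = 0
  &\;\Longleftrightarrow\;
  \frac{q_0}{(\bar{\chi}+q_1)^2} + \frac{q_1}{\bar{\chi}(\bar{\chi}+q_1)}
  = \alpha\Bigl[\frac{\mu}{\bar{\chi}^2} - \frac{T\sigma^2}{\bar{\chi}^3}\Bigr]. \\
\intertext{The derivatives of $G$ with respect to $q_0$ and $q_1$ differ only in one term,}
\frac{\partial G}{\partial q_1} - \frac{\partial G}{\partial q_0}
  &= -\frac{q_0}{(\bar{\chi}+q_1)^2}, \\
\intertext{so subtracting the two conditions $1 - \mu_1^{-1}\partial G/\partial q_i = 0$ gives}
\frac{1}{\mu_1}\frac{q_0}{(\bar{\chi}+q_1)^2} &= 0 .
\end{align*}
```

Setting $q_0 = 0$ in $G = 0$ reproduces the first coupled equation, and multiplying the $\bar{\chi}$ condition by $\bar{\chi}$ reproduces the second. Because $G = 0$ at the extremum, the free energy collapses to $q_1/(\alpha\kappa)$.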



Appendix E 
Proof of Corollary 1

We prove that the fixed-point equation (43) reduces to (51) in $T \to \infty$. We first show that the $\kappa$-quantile (36) converges to the expectation



E[Ek{q)] =E 
in T — > oo. Let <^ denote a variable that satisfies 



l = FT{^;q) 



(175) 



(176) 



in $T \to \infty$. Since the cdf (35) is monotonically increasing, we find $\xi \geq \xi_{\kappa,T}(q)$, with (36). The weak law of large numbers implies that the random variable (32) converges in probability to (175) in $T \to \infty$, so that the cdf (35) converges to



lim FT{x;q) = l{x>E[Ek{q)]) 



(177) 
(178) 



in T ^> oo. Thus, (I176I I reduces to 

l = l{l>E[Ek{q)]) 

in T — > oo. The smallest variable £, satisfying ( 1178b is given by ( I175I I. This implies that (|36] | is bounded from above by (1175b 

in T — > oo: 

limsup^^,T{q)<E[Ek{q)]. (179) 



T->-oo 



Similarly, considering a variable f that satisfies 
in T oo, we obtain the lower bound on 



= FT(e;<z) 



limini UT{q)>E[Ek{q)]. 



Combining the two bounds (179) and (181) yields



lim CK.T(g) =E[i^fc(g)], 

T— foo 



(180) 
(181) 
(182) 



with ( 1175b . 



We next calculate (41) in $T \to \infty$. Integrating (41) by parts after the transformation $y = F_T^{-1}(x; q)$, we obtain

K-<.t(«) 



Mk,t(<7) 



yFT{y\q)dy 



FT{x;q)dx, 



(183) 
with (36). Applying (177) and (182) to (183) yields

$$\lim_{T\to\infty} \mu_{\kappa,T}(q) = \kappa E[E_k(q)], \tag{184}$$

with (175). This implies that (43) reduces to (51) in $T \to \infty$.
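The concentration argument behind this limit can be illustrated by simulation. The sketch below is only a toy model: it assumes, purely for illustration, that each per-user quantity is the average of $T$ i.i.d. unit-mean exponential variables (the paper's random variable (32) is not reproduced here), and it interprets $\kappa$ as the fraction of smallest values that are kept. The function names are hypothetical. As $T$ grows, every user's value concentrates around its mean, so the normalized sum of the smallest $\kappa$-fraction approaches $\kappa$ times the mean, i.e. selection stops helping.

```python
import numpy as np

def partial_mean_smallest(samples, kappa):
    """(1/K) * sum of the floor(kappa*K) smallest entries of samples."""
    K = samples.size
    k_sel = int(np.floor(kappa * K))
    # np.partition places the k_sel smallest entries in the first k_sel slots.
    smallest = np.partition(samples, k_sel - 1)[:k_sel]
    return smallest.sum() / K

def simulate(T, kappa, K, seed):
    """Toy per-user values: mean of T i.i.d. Exp(1) variables (an assumption).

    The mean of T unit exponentials is Gamma(shape=T, scale=1/T).
    """
    rng = np.random.default_rng(seed)
    samples = rng.gamma(shape=T, scale=1.0 / T, size=K)
    return partial_mean_smallest(samples, kappa)

kappa, K = 0.5, 200_000
for T in (1, 10, 100, 2000):
    print(T, simulate(T, kappa, K, seed=0))

# For T = 1 the exact large-K value is kappa + (1 - kappa) * ln(1 - kappa);
# for large T the value approaches kappa * 1 (kappa times the unit mean).
print(kappa + (1 - kappa) * np.log(1 - kappa), kappa)
```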



Appendix F 
Derivation of Proposition 3

A. Replica Method

As shown in Appendix C-A, the Gibbs-Boltzmann distribution (105) converges to (107) in the low-temperature limit $\beta \to \infty$. This implies that the marginal distribution $\Pr(s_i; \beta) = \sum_{\backslash s_i} \Pr(s; \beta)$ tends to



1 



|5g(^)| 



(185) 



in the low-temperature limit, where $\mathcal{S}_g(i)$ is the set of the $i$th element $s_i$ included in the ground states $\mathcal{S}_g$. Thus, evaluating the conditional joint distribution $\Pr(s_k = 1, \tilde{x}_k \in \prod_{t=0}^{T-1} \mathcal{A}_t \mid x_k)$ reduces to calculating a marginal distribution of the Gibbs-Boltzmann distribution associated with the Hamiltonian (112).
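The low-temperature behaviour invoked here can be seen on a toy example. The sketch below assumes a small finite state space with hypothetical energies, and assumes that the limiting distribution is uniform over the ground states; it is not the paper's Hamiltonian. It shows that the Gibbs-Boltzmann weights $e^{-\beta H}$ concentrate on the minimum-energy states as $\beta$ grows, so that a marginal becomes the fraction of ground states in which the given element equals 1.

```python
import numpy as np

def gibbs(energies, beta):
    """Gibbs-Boltzmann probabilities proportional to exp(-beta * energy)."""
    e = np.asarray(energies, dtype=float)
    w = np.exp(-beta * (e - e.min()))   # shift by the minimum for stability
    return w / w.sum()

def marginals(states, energies, beta):
    """Pr(s_i = 1; beta) for binary states given as rows of `states`."""
    p = gibbs(energies, beta)
    return p @ np.asarray(states, dtype=float)

# Hypothetical toy system: four binary states with two degenerate ground states.
states = np.array([[0, 1], [1, 0], [1, 1], [0, 0]])
energies = [1.0, 0.2, 0.2, 3.0]

for beta in (1.0, 10.0, 200.0):
    print(beta, gibbs(energies, beta), marginals(states, energies, beta))
```

At large $\beta$ the probabilities approach $(0, 1/2, 1/2, 0)$: the first element equals 1 in both ground states, while the second equals 1 in only one of them.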
We start with the identity 



Pr Sfe = l,Xk ell At 



t=o 



p{sk = l,Xk\Xk)dXk, 



(186) 



with 



p{sk,Xk\Xk)= lim lim limZ(/3,A)'' 

A— >-oo /3— >-oo u— ^0 



•E 



E 

\Sk 



T-1 



{St}, {■"*}) 



Xh 



(187) 



where Tix{S, {wt}) and Z{/3, A) are given by (II 12b and (II 18b . respectively. In (1187b . J2\sk denotes the marginalization 
over {sk' G {0, 1} : fc' ^ k}. Furthermore, J d\Xk represents the marginalization over {xk' .t G ^x^., ^ ■ for all t and k' ^ k}. 
Regarding u in (11871) as a non-negative integer gives 



with 



p{sk,Xk\Xk) = lim lim lim Z„(sfc, A'fc, A'fe; /?, A), 

A— >-oo p— ^oo >-0 



Zu{sk, Xk, Xk] (3, A) 



(188) 



ATT 



(189) 



In ( 11891 ). the pdf p{Q) is given by (1130b . Furthermore, Sk,Xk,Xk) is defined as 



^flA (Q' *fc,0' '^'fc.O, '^k) — IE 



E /n-p{ 



'^\9{{skAAMk)},{Mk)})}d\Xk,o\Xk 



(190) 



with (129), where $\tilde{x}_{k,0}$ is given by $\tilde{x}_{k,0} = \{\tilde{x}_{k,t,0} : t = 0, \ldots, T-1\}$. Substituting (130) into (189) yields



Zu{sk, Xk,Xk; (3, A) 



(7„ / dQdetQ 



-uT 



■ exp |/37V InSg (Q, s^, Xk,Xk) - UQ)^ | , 



(191) 
with (133). Assuming that the large-system limit and the limits in (188) are commutative, we use the saddle-point method to
obtain



p{skT Xk\Xk) = lim lim lim lim < 

A— >oo /3— >oo «— >-0 if— >co L 
„-/3Ar7„(QJ^(«) 



(192) 



where $Q_s$ is the solution that minimizes (135), given by (137). In the derivation of (192), we have used the fact that the difference
between \n'E!-p^{Q,Sk,Xk^Xk) and InS^A(Q) should be 0(1). 



B. Replica Symmetry Solution 

We evaluate (192) under the RS assumption (138). The order parameter $q_0$ satisfies the fixed-point equation (152). Furthermore, from (139), it is straightforward to find that $I_u(Q_s)$ tends to zero in $u \to 0$.
We next calculate (190) with (141) to obtain

lirn^ Jim E^^^^ {Q^ , Sk,o, Xkfl, Xk) 



= lim E 

K—yoo 



E 



^-Hf^\{sk,o}A5^o(k)}A^{k)}) 



d\Xk,, 



Xk 



(193) 



where E'j^f\{z{k)}) and H^^^\{sk,o},{^oik)},{z{k)}) are given by ( fl43] l and (fl44l i. respectively. Substituting (fT93] l into 



(194) 



(192) and then taking $\beta \to \infty$ with $\tilde{\chi} = \beta\chi$ fixed before $\lambda \to \infty$, we have

p{sk,Xk\Xk) 
= lim lim E eyi^\-^H^^^\sk,Xk)\ Xk 



with 



T-l 

H(^^\sk,Xk) - ^ E I^M - 



t=o 



min ^ ^ ..i^f ^^(.o) 



- E ^k'El^^\qo) 



(195) 



where $E_k^{(\mathrm{RS})}(q_0)$ is given by (148). The quantity (195) is non-negative for any $s_k$ and $\tilde{x}_k$, and zero if and only if $(s_k, \tilde{x}_k)$ is equal to the optimal solution $(s_k^{(\mathrm{opt})}(q_0), \{\tilde{x}_{k,t}^{(\mathrm{opt})}(q_0)\})$, given by



^feT^('?")= argmin |ifc,t - V9o(^(fc))t| 



(196) 



4°^*^(9o)=argminL,4«^)(,o) 
sfce{o,i} I 



min _ V Sfc'^;f^^^ ^(go) > , 



(197) 



with (148). Substituting (194) into (186), we arrive at



T-l 



A"*. ] ^ lim E 



Pr Sfc ^l,XkeY[Ai 

\ t=o 

i(4°p*)(go) = i) ni(4T'Me^. 



Xk 



(198) 



where $\tilde{x}_{k,t}^{(\mathrm{opt})}(q_0)$ and $s_k^{(\mathrm{opt})}(q_0)$ are given by (196) and (197), respectively.
In order to complete the derivation of Proposition 3, we prove that (198) reduces to (53) with $q = q_0$. The solution (197) takes 1 if and only if $E_k^{(\mathrm{RS})}(q_0)$ is smaller than the $\tilde{K}$th order statistic, i.e. $E_k^{(\mathrm{RS})}(q_0) < E_{(\tilde{K})}^{(\mathrm{RS})}(q_0)$. Lemma 1 implies that the $\tilde{K}$th order statistic $E_{(\tilde{K})}^{(\mathrm{RS})}(q_0)$ converges in probability to the $\kappa$-quantile $\xi_{\kappa,T}(q_0)$, given by (36), in the large-system limit. This observation implies that (198) reduces to (53) with $q = q_0$.
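The selection rule described above amounts to keeping the users with the smallest per-user cost. The following sketch uses hypothetical random costs in place of the per-user quantities from (148), with function names invented for the illustration. It checks on a small instance that choosing the $\tilde{K}$ smallest costs attains the minimum of $\sum_k s_k E_k$ over all binary selections with exactly $\tilde{K}$ ones, and that every selected cost lies at or below the $\tilde{K}$th order statistic.

```python
import numpy as np
from itertools import combinations

def select_users(costs, k_sel):
    """Binary selection vector keeping the k_sel users with smallest cost."""
    e = np.asarray(costs, dtype=float)
    idx = np.argsort(e, kind="stable")[:k_sel]
    s = np.zeros(e.size, dtype=int)
    s[idx] = 1
    return s

def brute_force_min(costs, k_sel):
    """Minimum of sum_k s_k * E_k over all selections with k_sel ones."""
    e = np.asarray(costs, dtype=float)
    return min(e[list(c)].sum() for c in combinations(range(e.size), k_sel))

# Hypothetical per-user costs (assumption: i.i.d. exponential, for illustration).
rng = np.random.default_rng(1)
costs = rng.exponential(size=8)
k_sel = 3

s = select_users(costs, k_sel)
threshold = np.sort(costs)[k_sel - 1]        # k_sel-th order statistic
print(s, s @ costs, brute_force_min(costs, k_sel), threshold)
```

For a large number of users the threshold computed in the last step behaves like a fixed quantile of the cost distribution, which is the property the derivation relies on.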

Appendix G 
Derivation of Proposition 4

We start with (192). Let us calculate (192) under the 1RSB assumption (153). As shown in Appendix D, $q_0$ tends to zero, and $q_1$ satisfies the coupled fixed-point equations (173) and (174). Furthermore, from (154), it is straightforward to find that $I_u(Q_s)$ tends to zero in $u \to 0$.

We next calculate (190) with (156) to obtain

lim lim '^'}^^iQs, Sk^o, ^k,o, ^k) 



u— )-0 if — >oo 



= lim E 

K—^oo 



{s™({^(fc)},{^o(fc)})}"' E / A-t-M 



(199) 



(l + /3Ax)^^^'=i 

where s)j\^^^^({2(fc)}, {2;o(fc)}) and H^p^^'^^\{skfi}AMk)},{z{k)},{Mk)}) are given by dEl and (fT59] l. respectively. 
Substituting (199) into (192) and then taking the limit $\beta \to \infty$, $m_1 \to 0$, and $\chi \to 0$ with $\mu_1 = \beta m_1$ and $\tilde{\chi} = \beta\chi$ fixed before taking $\lambda \to \infty$, we have



p{sk,Xk\Xk) 
= lim lim E 



exp|-|i7(ii^s«)(s,,i'fe)| 



(200) 



with 



i?(^^^«)(sfe, i-,) = ^ 5] life,, - v^(^o(fc)), 



T-l 



+ min V Sfe,4j«'^^'(0,gi) 



(201) 



where $E_k^{(\mathrm{1RSB})}(q_0, q_1)$ is given by (163). The quantity (201) is non-negative for any $s_k$ and $\tilde{x}_k$, and zero if and only if $(s_k, \tilde{x}_k)$ is equal to the optimal solution $(s_k^{(\mathrm{opt})}(0, q_1), \{\tilde{x}_{k,t}^{(\mathrm{opt})}(0, q_1)\})$, given by



5iT^(0,9i) = argmin \xk,t - ^/qI{zo{k))tf , 



(202) 



4°P*)(0,gi)= argmin <; s^i?^^"""^ (0, gi) 
Ske{o,i} I 



r(iRSB), 



+ min _ ^Sfc,i?™(0,gi)l, 



(203) 



with (163). Substituting (200) into (186), we arrive at



T-l 



A-fc = lim E 



Pr Sfc = l.i-fc e [] A 
\ t=o 



(204) 



where $\tilde{x}_{k,t}^{(\mathrm{opt})}(0, q_1)$ and $s_k^{(\mathrm{opt})}(0, q_1)$ are given by (202) and (203), respectively. Repeating the argument in the end of Appendix F-B, we find that (204) reduces to (53) with $q = q_1$.